ocelot.routines package

Submodules

ocelot.routines.conformerparser module

class ocelot.routines.conformerparser.ACParser(ac: numpy.ndarray, charge, atomnumberlist, sani=True, apriori_radicals=None)[source]

Bases: object

BO_is_OK(BO, DU_from_AC, atomicNumList, charged_fragments, valences)[source]

check bond order matrix based on

Parameters
  • BO

  • DU_from_AC – based on valences arg

  • atomicNumList

  • charged_fragments

  • valences – valence assignment related to current BO

Returns

__init__(ac: numpy.ndarray, charge, atomnumberlist, sani=True, apriori_radicals=None)[source]
Variables

  • self.valences_list – a list of possible valence assignments; valences_list[i] is one possible assignment, in which the jth atom has valence valences_list[i][j]

  • self.atomic_valence_electrons – atomic_valence_electrons[i] is the number of valence electrons of the ith atom

  • self.apriori_radicals – a dict marking the atoms that will have a lower valence when generating BO

addBO2mol(rdmol, BO_matrix, charged_fragments, force_single=False)[source]
static getUADU(maxValence_list, valence_list)[source]

get unsaturated atoms (UA) and degree of unsaturation (DU) between two possible assignments

Parameters
  • maxValence_list

  • valence_list

Returns
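
As a rough illustration of the description above, the sketch below treats an atom as unsaturated when its maximum valence exceeds its assigned valence and records the difference as its degree of unsaturation. This is an assumption based on the docstring only, not the package's verified implementation; the name get_ua_du is hypothetical.

```python
def get_ua_du(max_valence_list, valence_list):
    """Return (UA, DU): indices of unsaturated atoms and their unsaturation."""
    ua, du = [], []
    for i, (max_v, v) in enumerate(zip(max_valence_list, valence_list)):
        if max_v > v:
            # Atom i could still form (max_v - v) more bonds.
            ua.append(i)
            du.append(max_v - v)
    return ua, du


# Ethene-like example: two carbons with three bonds each, four saturated hydrogens.
ua, du = get_ua_du([4, 4, 1, 1, 1, 1], [3, 3, 1, 1, 1, 1])
```

Here both carbons are reported as unsaturated, each with one missing bond.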

static get_BO(AC, DU_init, valences, UA_pairs)[source]

For a valence assignment, get BO, where BO[i][j] is the bond order between the ith and jth atoms. AC is a BO with all single bonds. The algorithm increases bond orders such that the degree of unsaturation (DU) does not change. Note that DU is calculated from the given valences.

Parameters
  • DU_init

  • AC

  • valences

  • UA_pairs

Returns

static get_UA_pairs(UA, AC)[source]

find the largest list of bonds in which each atom appears at most once

Parameters
  • UA

  • AC

Returns
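
Finding the largest set of bonds in which each atom appears at most once is a maximum matching problem. The sketch below solves it by brute force, which is only practical for a handful of unsaturated atoms. It is an illustration under that assumption, not the package's code; bonds_between and largest_disjoint_bond_set are hypothetical names.

```python
from itertools import combinations


def bonds_between(ua, ac):
    """Unique (i, j) tuples, i < j, for bonded pairs of unsaturated atoms."""
    return [(i, j) for i, j in combinations(sorted(ua), 2) if ac[i][j]]


def largest_disjoint_bond_set(ua, ac):
    """Brute-force search, largest subsets first, for bonds sharing no atom."""
    bonds = bonds_between(ua, ac)
    for size in range(len(bonds), 0, -1):
        for subset in combinations(bonds, size):
            atoms = [a for bond in subset for a in bond]
            if len(atoms) == len(set(atoms)):
                return list(subset)
    return []


# Butadiene-like carbon chain 0-1-2-3, all four atoms unsaturated.
ac = [[0, 1, 0, 0],
      [1, 0, 1, 0],
      [0, 1, 0, 1],
      [0, 0, 1, 0]]
pairs = largest_disjoint_bond_set([0, 1, 2, 3], ac)
```

For the chain above the search picks the two terminal bonds, which is where the double bonds of butadiene sit.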

static get_atomic_charge(atomic_number, atomic_valence_electrons, BO_valence)[source]

atomic charge computed as #_valence_electrons - bond_order (TODO: test robustness)

Parameters
  • atomic_number

  • atomic_valence_electrons

  • BO_valence

Returns

static get_bonds(UA, AC)[source]

get a list of unique bond tuples (i, j) between UAs

Parameters
  • UA

  • AC

Returns

get_valence_info()[source]
init_rdmol()[source]
parse(charged_fragments=False, force_single=False, expliciths=True)[source]
parse_bonds(charged_fragments)[source]

find the best BO

Parameters

charged_fragments

Returns

set_atomic_charges(mol, BO_valences, BO_matrix)[source]
set_atomic_radicals(mol, BO_valences)[source]
static valences_not_too_large(BO, vs)[source]
ocelot.routines.conformerparser.chiral_stereo_check(mol)[source]
ocelot.routines.conformerparser.clean_charges(mol)[source]
ocelot.routines.conformerparser.pmgmol_to_rdmol(pmg_mol)[source]
ocelot.routines.conformerparser.valence_electron(element)[source]

count valence electrons based on the electronic configuration; if a subshell has > 10 electrons, that subshell is ignored

ocelot.routines.disparser module

class ocelot.routines.disparser.AsymmUnit(psites: [<class 'pymatgen.core.sites.PeriodicSite'>])[source]

Bases: object

__init__(psites: [<class 'pymatgen.core.sites.PeriodicSite'>])[source]

an asymmetric unit with paired disorder units

Parameters

psites

# :param supress_sidechain_disorder: can be a function takes a psite and returns bool, # if True the psite will always be considered as a non-disordered site

infodict(is_sidechain=None)[source]
class ocelot.routines.disparser.AtomLabel(label: str)[source]

Bases: object

__init__(label: str)[source]

a class for atom label in cif file, overkill I guess…

Parameters

label

classmethod from_psite(s: pymatgen.core.sites.PeriodicSite)[source]
get_labels_with_same_ei(als)[source]

get a list of atomlabel whose ei == al.ei

Parameters

als

Returns

static get_labels_with_tag(tag, als)[source]
static get_psite_by_atomlable(psites: [<class 'pymatgen.core.sites.PeriodicSite'>], al)[source]
class ocelot.routines.disparser.ConfigConstructor[source]

Bases: object

static build_supercell_full_disorder(pstructure, scaling_matrix)[source]

get a supercell with all disordered sites inside, should be used to generate a certain config based on instruction

Parameters
  • pstructure

  • scaling_matrix

Returns

static dissc_to_config(sc: pymatgen.core.structure.Structure, inv_labels: [<class 'str'>], disportions, instruction)[source]

take instructions to generate a certain config from super cell

static gen_instructions(asym_portions, n_asym, n_cell)[source]

get all possible instructions, notice this is exponentially scaled

# of configs = 2^n_port*n_asym*n_cell where 2 comes from pairwise disordered occupancies
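
To give a feel for the scaling, the sketch below counts and enumerates binary choices under one reading of the formula above, namely that the whole product n_port*n_asym*n_cell is the exponent. That reading, and the names count_configs and enumerate_choices, are assumptions made for illustration.

```python
from itertools import product


def count_configs(n_port, n_asym, n_cell):
    # Assumed reading: the full product is the exponent of 2.
    return 2 ** (n_port * n_asym * n_cell)


def enumerate_choices(n_port, n_asym, n_cell):
    """One 0/1 choice per disorder pair, per asymmetric unit, per cell."""
    return product((0, 1), repeat=n_port * n_asym * n_cell)


n = count_configs(n_port=2, n_asym=2, n_cell=1)
choices = list(enumerate_choices(2, 2, 1))
```

Even this small case yields 16 choices, and a 2x2x2 supercell with the same n_port and n_asym would already give 2^32.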

class ocelot.routines.disparser.DisParser(cifstring: str)[source]

Bases: object

__init__(cifstring: str)[source]

this can only handle one alternative configuration for the asymmetric unit

one asymmetric unit == inv_conf + disg1 + disg2

disg1 = disunit_a + disunit_b + …

disg2 = disunit_a’ + disunit_b’ + …

DisorderPair_a = disunit_a + disunit_a’

if the cif file contains previously fitted occu and disg, we call it dis-0 and we use occu/disg info to get inv_conf, disg1, disg2

if there is no previously fitted info, we deal with the following situations:

dis-1: set(self.tags) is [“”, <non-word>, <word>, …], EI<non-word> – EI,

note x17059.cif is dis-1 but it has hydrogens like H12A – H12D, this can only be captured by previously fitted disg and occu,

dis-2: set(self.tags) is [“”, <non-word>, <word>, …], EI<non-word> – E’I’ e.g. ALOVOO.cif

nodis-0: no dup in self.eis, set(self.tags) is {“”}

nodis-1: dup in self.eis, set(self.tags) is [“”, <word>, …], this could be a dis as in ASIXEH

weird: else

for dis-1, dis-2, we fill the occu, disg fields in cifdata, so they can be converted to dis-0

Attributes:

data[atomlabel] = [x, y, z, symbol, occu, disgrp] this will be used to write config cif file

classify()[source]

one cif file belongs to one of the following categories:

dis-alpha: was fitted with, at least, disorder group

nodis-0: no dup in eis, one type of unique tag

dis-beta: dup in eis, EIx – EIy, x or y can be empty, e.g. c17013.cif

dis-gamma: no dup in eis, EIx – EJy, x or y can be empty, e.g. ALOVOO.cif

weird: else

note x17059.cif has hydrogens like H12A – H12D, this can only be captured by previously fitted disg and occu (dis-a), in general there should be a check on whether disgs identified by the parser are identical to previously fitted

static data2cifdata(data, cifdata)[source]

data[atomlabel] = x, y, z, symbol, occu, group

classmethod from_ciffile(fn)[source]
get_nearest_label(ali: ocelot.routines.disparser.AtomLabel, neighbor_labels: [<class 'ocelot.routines.disparser.AtomLabel'>], lattice: pymatgen.core.lattice.Lattice, cutoff=2.5)[source]

given a list of potential neighbouring AtomLabel objects, get the one that is closest and has the same element

get_psites_from_data()[source]

get psites from self.data, each psite will be assigned properties with fields occu disg label

Returns

static get_site_location_by_key(psites: [<class 'pymatgen.core.sites.PeriodicSite'>], key='label')[source]

loc[key] = ‘bone’/‘sidechain’; each psite must have ‘imol’, ‘disg’, ‘siteid’ and <key> assigned

static get_vanilla_configs(psites: [<class 'pymatgen.core.sites.PeriodicSite'>])[source]

must have ‘disg’ in properties

prepare_data()[source]
to_configs(write_files=False, scaling_mat=(1, 1, 1), assign_siteids=True, vanilla=True, supressdisorder=True)[source]

Returns
  • pstructure – pmg structure, the unit cell structure with all disordered sites

  • unwrap_str – pmg unwrapped structure, with all disordered sites

  • mols – a list of pmg mols with disordered sites

  • confs – [[conf1, occu1], …], where conf1 is a clean structure

exception ocelot.routines.disparser.DisorderParserError[source]

Bases: Exception

class ocelot.routines.disparser.DisorderUnit(psites: [<class 'pymatgen.core.sites.PeriodicSite'>], disg: str)[source]

Bases: object

__init__(psites: [<class 'pymatgen.core.sites.PeriodicSite'>], disg: str)[source]

a portion of an asymmetric unit representing one possibility

a pair of DisUnit is the basis of disorder, assuming maximal entropy

static find_counterpart(u1, u2s)[source]
property geoc

in Cartesian coordinates

is_likely_counterpart(other)[source]

ocelot.routines.disparser_functions module

exception ocelot.routines.disparser_functions.CifFileError[source]

Bases: Exception

ocelot.routines.disparser_functions.apply_symmop(psites, ops)[source]

symmop and xyz in cif file:

let’s say xyz – op1 –> x’y’z’ and xyz – op2 –> x!y!z!, and it is possible that x’y’z’ is_close x!y!z!

this means one should take only one of x’y’z’ or x!y!z!, i.e. op1 is equivalent to op2 due to the symmetry implied by xyz/the asymmetric unit, e.g. ALOVOO.cif – Z=2, the asymmetric unit given by the cif is one molecule, but there are 4 ops

so we need first check if the cif file behaves like this
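
The situation described above can be reproduced for a single site: apply every operation to a fractional coordinate and keep only the images that are not already represented, comparing modulo lattice translations. This is a simplified per-site sketch, not the package's routine; all function names and the tolerance are hypothetical.

```python
def apply_op(rot, trans, frac):
    """Apply x' = R x + t to a fractional coordinate and wrap into [0, 1)."""
    return tuple(
        (sum(rot[i][j] * frac[j] for j in range(3)) + trans[i]) % 1.0
        for i in range(3)
    )


def is_close_periodic(a, b, tol=1e-4):
    """Compare two fractional coordinates modulo lattice translations."""
    for x, y in zip(a, b):
        d = abs(x - y) % 1.0
        if min(d, 1.0 - d) > tol:
            return False
    return True


def unique_images(frac, ops, tol=1e-4):
    """Keep only images that do not coincide with an earlier image."""
    images = []
    for rot, trans in ops:
        new = apply_op(rot, trans, frac)
        if not any(is_close_periodic(new, old, tol) for old in images):
            images.append(new)
    return images


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
inversion = [[-1, 0, 0], [0, -1, 0], [0, 0, -1]]
ops = [(identity, (0, 0, 0)), (inversion, (0, 0, 0))]

general = unique_images((0.1, 0.2, 0.3), ops)  # two distinct images
special = unique_images((0.5, 0.0, 0.5), ops)  # inversion maps the site onto itself
```

A site on the inversion center yields a single image, which is the per-site analogue of two operations being equivalent for the given asymmetric unit.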

ocelot.routines.disparser_functions.braket2float(s)[source]
ocelot.routines.disparser_functions.find_connected_psites(psites: [<class 'pymatgen.core.sites.PeriodicSite'>])[source]

simplified version of percolation in unwrap

ocelot.routines.disparser_functions.get_connected_pblock(pblocks, cutoff=4.0)[source]
ocelot.routines.disparser_functions.get_pmg_dict(cifstring: str)[source]

use pmg dict to parse cifstring, only deal with one structure per file

Parameters

cifstring

Returns

ocelot.routines.disparser_functions.get_symmop(data)[source]

ocelot.routines.fileop module

ocelot.routines.fileop.check_allgausslog(howmanylog)[source]

check whether a given number of Gaussian jobs have finished

Parameters

howmanylog

Returns

ocelot.routines.fileop.check_gausslog(filename)[source]

check whether a Gaussian log terminated normally

Parameters

filename

Returns
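
A successful Gaussian run ends its log with a line containing "Normal termination of Gaussian". The sketch below checks the last non-blank line for that marker. It is a minimal illustration, not necessarily how check_gausslog is implemented; log_terminated_normally is a hypothetical name and the log contents are stand-ins written to a temporary directory.

```python
import os
import tempfile


def log_terminated_normally(filename):
    """Return True if the last non-blank line reports normal termination."""
    last = ""
    with open(filename) as f:
        for line in f:
            if line.strip():
                last = line
    return "Normal termination" in last


# Demonstration with stand-in log files.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.log")
    bad = os.path.join(tmp, "bad.log")
    with open(good, "w") as f:
        f.write(" SCF Done\n Normal termination of Gaussian 16.\n")
    with open(bad, "w") as f:
        f.write(" SCF Done\n Error termination\n")
    good_ok = log_terminated_normally(good)
    bad_ok = log_terminated_normally(bad)
```

Checking only the final line avoids a false positive from an earlier link of a multi-step job that finished before a later step failed.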

ocelot.routines.fileop.check_outcar(filename='OUTCAR')[source]

check whether an OUTCAR terminated normally

Parameters

filename

Returns

ocelot.routines.fileop.check_path(pathlist)[source]

check if the paths in the pathlist are all valid

Parameters

pathlist

Returns

ocelot.routines.fileop.copyfile(what, where)[source]

shutil operation to copy

Parameters
  • what

  • where

Returns

ocelot.routines.fileop.createdir(directory)[source]

mkdir

Parameters

directory

Returns

ocelot.routines.fileop.intkey(d: dict)[source]
ocelot.routines.fileop.lsexternsion(path, extension='cif')[source]

get a list of file names with certain extension in the path

Parameters
  • path

  • extension

Returns

[path/filename1.extension, …]

ocelot.routines.fileop.movefile(what, where)[source]

shutil operation to move

Parameters
  • what

  • where

Returns

ocelot.routines.fileop.nonblank_lines(f)[source]

get nonblank lines in a file

Parameters

f – string, file name

Returns

ocelot.routines.fileop.read_obj(fname)[source]

read an object from pickle

Parameters

fname

Returns

ocelot.routines.fileop.removefile(what)[source]
ocelot.routines.fileop.retrieve_name(var)[source]

Gets the name of var, searching from the outermost frame inwards. https://stackoverflow.com/questions/18425225/getting-the-name-of-a-variable-as-a-string

Parameters

var – variable to get name from.

Returns

string

ocelot.routines.fileop.stringkey(d: dict)[source]
ocelot.routines.fileop.write_obj(obj, fname)[source]

write an object into binary with pickle

Parameters
  • obj

  • fname

Returns

ocelot.routines.geometry module

class ocelot.routines.geometry.Fitter[source]

Bases: object

default for 3d

https://www.ltu.se/cms_fs/1.51590!/svd-fitting.pdf

https://stackoverflow.com/questions/2298390/fitting-a-line-in-3d

https://math.stackexchange.com/questions/2378198/computing-least-squares-error-from-plane-fitting-svd

https://stackoverflow.com/questions/12299540/plane-fitting-to-4-or-more-xyz-points

static iscollinear(pts, tol=0.001)[source]
static linear_fit(pts)[source]

vector * t + ptsmean = pt on fitted line

Parameters

pts – a (n, 3) array

Returns

vector, ptsmean, error
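
The SVD approach referenced in the links above can be sketched as follows: center the points, take the first right-singular vector as the line direction, and use the remaining singular values for the residual. This assumes the error is the sum of squared orthogonal distances, which may differ from what linear_fit reports; fit_line is a hypothetical name.

```python
import numpy as np


def fit_line(pts):
    """Least-squares line through an (n, 3) array: direction, mean, error."""
    pts = np.asarray(pts, dtype=float)
    mean = pts.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(pts - mean)
    direction = vt[0]
    # Squared singular values beyond the first sum to the squared
    # orthogonal distances of the points from the fitted line.
    error = float(np.sum(s[1:] ** 2))
    return direction, mean, error


pts = [[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]]
direction, mean, error = fit_line(pts)
```

Any point on the fitted line is then direction * t + mean; the sign of direction is arbitrary.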

static plane_fit(pts)[source]

this only returns a normal, we do not have a plane equation here

Parameters

pts – a (n, 3) array

Returns

normal, ptsmean, error
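
A plane normal can be obtained from the same decomposition by taking the right-singular vector with the smallest singular value. The sketch below assumes at least three points and reports the squared smallest singular value as the error, which is an assumption about the metric; fit_plane_normal is a hypothetical name.

```python
import numpy as np


def fit_plane_normal(pts):
    """Least-squares plane through an (n, 3) array, n >= 3: normal, mean, error."""
    pts = np.asarray(pts, dtype=float)
    mean = pts.mean(axis=0)
    _, s, vt = np.linalg.svd(pts - mean)
    # The direction of least variance is perpendicular to the best-fit plane.
    normal = vt[-1]
    # Sum of squared distances of the points from the plane.
    error = float(s[-1] ** 2)
    return normal, mean, error


pts = [[0, 0, 1], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
normal, mean, error = fit_plane_normal(pts)
```

The returned normal together with the mean point is enough to define the plane, even though no explicit plane equation is produced.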

exception ocelot.routines.geometry.GeometryError[source]

Bases: Exception

ocelot.routines.geometry.abcabg2pbc(abcabg)[source]
ocelot.routines.geometry.alpha_shape(points, alpha=0.7)[source]

0.7 seems to work for adt (see tests). Usually the problem seems to be that there are too few points to get enough triangles; if we can add more points inside the rings, we can get a smaller concave hull.

https://gist.github.com/dwyerk/10561690

Compute the alpha shape (concave hull) of a set of 2D points.

Parameters
  • points – Iterable container of points.

  • alpha – alpha value to influence the gooeyness of the border. Smaller numbers don’t fall inward as much as larger numbers. Too large, and you lose everything!

ocelot.routines.geometry.angle_btw(v1, v2, output='radian')[source]

get angle between two vectors

Parameters
  • v1

  • v2

  • output

Returns
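
The angle between two vectors follows from the normalized dot product. The sketch below is a plain-Python illustration; the accepted values of output ('radian' and 'degree') are assumed from the signature, and angle_between is a hypothetical name.

```python
import math


def angle_between(v1, v2, output="radian"):
    """Angle between two vectors, in radians or degrees."""
    dot = sum(a * b for a, b in zip(v1, v2))
    n1 = math.sqrt(sum(a * a for a in v1))
    n2 = math.sqrt(sum(b * b for b in v2))
    # Clamp to [-1, 1] to guard against rounding slightly outside acos's domain.
    cos_t = max(-1.0, min(1.0, dot / (n1 * n2)))
    angle = math.acos(cos_t)
    return math.degrees(angle) if output == "degree" else angle


right_angle = angle_between((1, 0, 0), (0, 2, 0), output="degree")
straight = angle_between((1, 0, 0), (-3, 0, 0))
```

The clamp matters for nearly parallel vectors, where floating-point error can push the cosine just past 1.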

ocelot.routines.geometry.arbitrary_noraml(v)[source]
ocelot.routines.geometry.are_collinear(v1, v2)[source]
ocelot.routines.geometry.cart2frac(coords, pbc)[source]

convert to frac

Parameters
  • coords

  • pbc – a 3x3 mat, [a, b, c]

Returns

ocelot.routines.geometry.cart2polar(coords)[source]
ocelot.routines.geometry.coord_reverse(p, q, o, coords)[source]
ocelot.routines.geometry.coord_transform(p, q, o, coords)[source]

get the new coordinates in the Cartesian system defined by p, q, o, which are mutually orthogonal unit vectors

Parameters
  • p

  • q

  • o

  • coords

Returns

ocelot.routines.geometry.dist_pt2line(p, q, r)[source]

https://stackoverflow.com/questions/50727961/

Parameters
  • p – end of segment

  • q – end of segment

  • r – the point

Returns

ocelot.routines.geometry.frac2cart(coords, pbc)[source]
ocelot.routines.geometry.genlinepts(a, b, stepsize)[source]
ocelot.routines.geometry.get_plane_param(normal, pt)[source]

ax + by + cz + d = 0

Parameters
  • normal

  • pt

Returns

ocelot.routines.geometry.get_proj_point2plane(pt, normal, ptonplane)[source]
Parameters
  • pt – input point coords

  • normal – plane normal

  • ptonplane – a point that is on the plane, a part of plane definition

Returns

the projection of input point on the plane
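
The projection can be written as pt minus its offset along the normal. The sketch below does not require the normal to be normalized; project_point_to_plane is a hypothetical name used for illustration.

```python
def project_point_to_plane(pt, normal, pt_on_plane):
    """Orthogonal projection of pt onto the plane through pt_on_plane."""
    nn = sum(n * n for n in normal)
    # Offset of pt from the plane along the normal, in units of the normal.
    t = sum((p - q) * n for p, q, n in zip(pt, pt_on_plane, normal)) / nn
    return [p - t * n for p, n in zip(pt, normal)]


# Project onto the plane z = 1 using an unnormalized normal.
proj = project_point_to_plane([1.0, 2.0, 5.0], [0.0, 0.0, 2.0], [0.0, 0.0, 1.0])
```

Dividing by the squared length of the normal is what makes normalization unnecessary.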

ocelot.routines.geometry.norm(v)[source]
ocelot.routines.geometry.rotate_along_axis(v, axis, theta, thetaunit='degree')[source]
Parameters
  • v

  • axis

  • theta

  • thetaunit

Returns

rotate v along axis counterclockwise

ocelot.routines.geometry.rotation_matrix(axis, theta, thetaunit='degree')[source]

Return the rotation matrix associated with counterclockwise rotation about the given axis by theta.

np.dot(rotation_matrix(axis,theta_d), v)

https://stackoverflow.com/questions/6802577/rotation-of-3d-vector

Parameters
  • axis – a list of 3 floats

  • theta

  • thetaunit
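
One way to build such a matrix is Rodrigues' rotation formula, sketched below. The linked answer uses the related Euler-Rodrigues parameterization, so this is an equivalent construction rather than the package's code; rodrigues_matrix is a hypothetical name.

```python
import numpy as np


def rodrigues_matrix(axis, theta, thetaunit="degree"):
    """Matrix for a counterclockwise rotation by theta about axis."""
    if thetaunit == "degree":
        theta = np.radians(theta)
    k = np.asarray(axis, dtype=float)
    k = k / np.linalg.norm(k)
    # Cross-product (skew-symmetric) matrix of the unit axis.
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


# Rotating the x axis by 90 degrees about z gives the y axis.
rotated = np.dot(rodrigues_matrix([0, 0, 1], 90), [1.0, 0.0, 0.0])
```

Counterclockwise here means following the right-hand rule when looking down the axis toward the origin.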

ocelot.routines.geometry.unify(v)[source]

ocelot.routines.loop module

as we are using networkx this can be deprecated…

class ocelot.routines.loop.Loopsearcher(nbmap)[source]

Bases: object

__init__(nbmap)[source]

giving a connection table, find all possible loops with a certain loop size

nbmap[i] does not contain i itself

useful to consider smallest set of smallest rings (sssr) problem

check doi:10.1073/pnas.0813040106 for a better solution

Parameters

nbmap – connection table

alex_method(loopsize)[source]

I figured this out but I’m sure I’m not the first

Parameters

loopsize (int) – ring size to look for

Returns

a list of indices

expand(path_set)[source]

the path will never intersect itself

generate_edges()[source]
static loop2edges(loop)[source]

notice that here loop is something like [1, 2, 3], where it is implied that 1 is connected to 3

Parameters

loop

Returns

e.g. [{1, 2}, {2, 3}, {1, 3}]
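
The conversion described above can be sketched by pairing each index with its successor and wrapping around at the end. The name loop_to_edges is hypothetical, and the edge order may differ from the package's output.

```python
def loop_to_edges(loop):
    """Edges of a closed loop as a list of 2-element sets."""
    n = len(loop)
    # The modulo closes the ring by pairing the last index with the first.
    return [{loop[i], loop[(i + 1) % n]} for i in range(n)]


edges = loop_to_edges([1, 2, 3])
```

Using sets makes each edge undirected, so {3, 1} and {1, 3} compare equal.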

sssr_alex(size_min, size_max)[source]

ocelot.routines.mathop module

ocelot.routines.mathop.ev_to_ha(eigen_list)[source]

convert eV to Hartree

Parameters

eigen_list

Returns

ocelot.routines.mathop.fd_reci_2ndder(y, x, x0_index, step=1, accuracy='high')[source]

https://en.wikipedia.org/wiki/Finite_difference_coefficient

x should be uniformly sampled over a segment

Parameters
  • y

  • x – length should be = (order+2)

  • x0_index – index of x0 at which derivative is calculated

  • step – step size based on index

  • accuracy

Returns

1/y’’
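
For illustration, the simplest central stencil from the table linked above (coefficients 1, -2, 1) gives the second derivative on a uniform grid, and the function returns its reciprocal. How the accuracy argument selects a stencil in the package is not shown here, so this sketch fixes the three-point stencil; reciprocal_second_derivative is a hypothetical name.

```python
def reciprocal_second_derivative(y, x, x0_index, step=1):
    """1/y'' at x[x0_index] from a three-point central difference."""
    # Grid spacing between the stencil points; x is assumed uniform.
    h = x[x0_index + step] - x[x0_index]
    d2 = (y[x0_index - step] - 2.0 * y[x0_index] + y[x0_index + step]) / h ** 2
    return 1.0 / d2


x = [0.1 * i for i in range(5)]
y = [xi ** 2 for xi in x]  # y'' = 2 everywhere
value = reciprocal_second_derivative(y, x, x0_index=2)
```

A quantity of the form 1/y'' appears, for example, when estimating an effective mass from band curvature.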

ocelot.routines.mathop.ra_to_rb(ra_list)[source]

convert reciprocal angstrom (1/AA) to reciprocal Bohr

Parameters

ra_list – a list of floats in 1/AA

Returns
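
The two unit conversions in this module amount to scaling by fixed constants. The sketch below uses CODATA 2018 values; the constants actually used by the package may differ in the last digits, and the function names are hypothetical.

```python
HARTREE_IN_EV = 27.211386245988    # 1 Hartree expressed in eV (CODATA 2018)
BOHR_IN_ANGSTROM = 0.529177210903  # 1 Bohr expressed in angstrom (CODATA 2018)


def ev_to_hartree(values):
    """Convert a list of energies from eV to Hartree."""
    return [v / HARTREE_IN_EV for v in values]


def recip_angstrom_to_recip_bohr(values):
    """Convert a list of reciprocal lengths from 1/angstrom to 1/Bohr."""
    # One angstrom is longer than one Bohr, so the reciprocal value shrinks.
    return [v * BOHR_IN_ANGSTROM for v in values]


energies = ev_to_hartree([27.211386245988, 13.605693122994])
kpoints = recip_angstrom_to_recip_bohr([1.0])
```

Note that the reciprocal conversion multiplies by the Bohr radius rather than dividing by it.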

ocelot.routines.mopac module

class ocelot.routines.mopac.MopacInput(sites, mopheader='CHARGES', comment_line='')[source]

Bases: object

__init__(sites, mopheader='CHARGES', comment_line='')[source]

does not support selective dynamics

Parameters
  • sites

  • mopheader

  • comment_line

classmethod from_mopinput(mopfn)[source]
static sites2mopinput(sites: [<class 'pymatgen.core.sites.Site'>], header, comment)[source]

return the string of mopac input

Parameters
  • sites

  • header

  • comment

Returns

write_mopinput(mopfn)[source]
class ocelot.routines.mopac.MopacOutput(fstring, caltype)[source]

Bases: object

__init__(fstring, caltype)[source]

parse_charges()[source]
parse_rlx_or_sp()[source]
parse_thermo()[source]
ocelot.routines.mopac.mop2siteobjs(mopfn)[source]

ocelot.routines.pbc module

ocelot.routines.pbc.AtomicRadius(site: pymatgen.core.sites.Site)[source]
class ocelot.routines.pbc.PBCparser[source]

Bases: object

static get_dist_and_trans(lattice: pymatgen.core.lattice.Lattice, fc1, fc2)[source]

get the shortest distance and corresponding translation vector between two frac coords

Parameters
  • lattice – pymatgen Lattice object

  • fc1

  • fc2

Returns
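
A common way to obtain the shortest periodic distance is to reduce the fractional difference to the nearest image and then test the neighbouring cells. The sketch below does this in plain Python with the lattice given as a 3x3 list of row vectors; it does not use pymatgen and is not the package's implementation. The name shortest_dist_and_trans is hypothetical, and for strongly skewed cells a wider search range may be needed.

```python
import math
from itertools import product


def shortest_dist_and_trans(matrix, fc1, fc2):
    """Shortest distance from fc1 to any image of fc2, and the integer shift."""
    diff = [b - a for a, b in zip(fc1, fc2)]
    # Integer shift that brings each component of diff close to zero.
    base = [-round(d) for d in diff]
    best_dist, best_trans = math.inf, None
    for extra in product((-1, 0, 1), repeat=3):
        trans = [b + e for b, e in zip(base, extra)]
        frac = [d + t for d, t in zip(diff, trans)]
        # Fractional to Cartesian: rows of matrix are the lattice vectors.
        cart = [sum(frac[i] * matrix[i][k] for i in range(3)) for k in range(3)]
        dist = math.sqrt(sum(c * c for c in cart))
        if dist < best_dist:
            best_dist, best_trans = dist, trans
    return best_dist, best_trans


cubic = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
dist, trans = shortest_dist_and_trans(cubic, (0.05, 0.5, 0.5), (0.95, 0.5, 0.5))
```

In this example the two sites are 9 apart inside the cell but only 1 apart across the boundary, so the second site is shifted by one cell along -a.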

static unwrap(pstructure: pymatgen.core.structure.Structure)[source]

unwrap the structure, extract isolated mols

this will modify psites in-place; properties will be inherited and a new property ‘imol’ will be written. A psite with imol=x is an element of both mols[x] and unwrap_pblock_list[x]

this method is not supposed to modify siteid!

Parameters

pstructure – periodic structure obj from pymatgen

Returns

mols, unwrap_str_sorted, unwrap_pblock_list

static unwrap_and_squeeze(pstructure: pymatgen.core.structure.Structure)[source]

after unwrapping, the mols can be far away from each other, this tries to translate them s.t. they stay together

Parameters

pstructure

Returns

Module contents

here are the reusable, schema-free routines