Python API

Represents a molecular graph structure.

Provides functionality to read molecular structure data from XYZ files, analyze molecular geometry, and visualize the molecular structure using Plotly.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `atoms` | `List[Atom]` | List of Atom objects representing the molecular structure. | `list()` |
| `bonds` | `List[Bond]` | List of Bond objects representing molecular connectivity. | `list()` |
| `comment` | `str` | Comment or description of the molecular structure. | `''` |
| `default_radii` | `Dict[str, float]` | Dictionary mapping element symbols to their default atomic radii. Each instance gets its own copy of the internal `_DEFAULT_RADII`. | `dict(_DEFAULT_RADII)` |
| `cpk_colors` | `Dict[str, str]` | Dictionary mapping element symbols to their CPK colors. Each instance gets its own copy of the internal `_DEFAULT_CPK_COLORS`. | `dict(_DEFAULT_CPK_COLORS)` |
| `cpk_color_rest` | `str` | Fallback color for elements not listed in `cpk_colors`. | `'pink'` |

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| `elements` | `List[str]` | List of element symbols in order of appearance. |
| `indices` | `List[int]` | List of atom indices in order. |
| `x` | `List[float]` | List of x coordinates for all atoms in order. |
| `y` | `List[float]` | List of y coordinates for all atoms in order. |
| `z` | `List[float]` | List of z coordinates for all atoms in order. |
| `xyz` | `NDArray[float64]` | Nx3 array of atomic coordinates. |
| `atomic_radii` | `List[float]` | List of atomic radii in atom order. |
| `bond_lengths` | `Dict[FrozenSet[int], float]` | Mapping of atom index pairs to bond lengths. |
| `adj_list` | `Dict[int, Set[int]]` | Dictionary mapping atom indices to sets of bonded atom indices. |
| `adj_matrix` | `NDArray[int_]` | Square matrix representing molecular connectivity. |

Source code in xyz2graph/graph.py
@dataclass
class MolGraph:
    """Represents a molecular graph structure.

    Provides functionality to read molecular structure data from XYZ files, analyze molecular
    geometry, and visualize the molecular structure using Plotly.

    Args:
        atoms (List[Atom], optional): List of Atom objects representing the molecular structure.
            Defaults to an empty list.
        bonds (List[Bond], optional): List of Bond objects representing molecular connectivity.
            Defaults to an empty list.
        comment (str, optional): Comment or description of the molecular structure.
            Defaults to an empty string.
        default_radii (Dict[str, float], optional): Dictionary mapping element symbols to their
            default atomic radii. Defaults to internal _DEFAULT_RADII.
        cpk_colors (Dict[str, str], optional): Dictionary mapping element symbols to their
            CPK color scheme colors. Defaults to internal _DEFAULT_CPK_COLORS.
        cpk_color_rest (str, optional): Default color for elements not specified in cpk_colors.
            Defaults to "pink".

    Attributes:
        elements (List[str]): List of element symbols in order of appearance.
        indices (List[int]): List of atom indices in order.
        x (List[float]): List of x coordinates for all atoms in order.
        y (List[float]): List of y coordinates for all atoms in order.
        z (List[float]): List of z coordinates for all atoms in order.
        xyz (NDArray[np.float64]): Nx3 array of atomic coordinates.
        atomic_radii (List[float]): List of atomic radii in atom order.
        bond_lengths (Dict[FrozenSet[int], float]): Mapping of atom index pairs to bond lengths.
        adj_list (Dict[int, Set[int]]): Dictionary mapping atom indices
            to sets of bonded atom indices.
        adj_matrix (NDArray[np.int_]): Square matrix representing molecular connectivity.
    """

    atoms: List[Atom] = field(default_factory=list)
    bonds: List[Bond] = field(default_factory=list)
    comment: str = field(default="")

    # Customizable parameters with defaults
    default_radii: Dict[str, float] = field(default_factory=lambda: dict(_DEFAULT_RADII))
    cpk_colors: Dict[str, str] = field(default_factory=lambda: dict(_DEFAULT_CPK_COLORS))
    cpk_color_rest: str = field(default="pink")

    @property
    def elements(self) -> List[str]:
        """Get list of elements in the molecule.

        Returns:
            List[str]: List of element symbols in order of appearance.
        """
        return [atom.element for atom in self.atoms]

    @property
    def indices(self) -> List[int]:
        """Get list of atom indices.

        Returns:
            List[int]: List of atom indices in order.
        """
        return [atom.index for atom in self.atoms]

    @property
    def x(self) -> List[float]:
        """Get x coordinates of all atoms.

        Returns:
            List[float]: List of x coordinates in atom order.
        """
        return [atom.x for atom in self.atoms]

    @property
    def y(self) -> List[float]:
        """Get y coordinates of all atoms.

        Returns:
            List[float]: List of y coordinates in atom order.
        """
        return [atom.y for atom in self.atoms]

    @property
    def z(self) -> List[float]:
        """Get z coordinates of all atoms.

        Returns:
            List[float]: List of z coordinates in atom order.
        """
        return [atom.z for atom in self.atoms]

    @property
    def xyz(self) -> NDArray[np.float64]:
        """Get atomic coordinates as a numpy array.

        Returns:
            NDArray[np.float64]: Nx3 array of atomic coordinates.
        """
        return np.array([[atom.x, atom.y, atom.z] for atom in self.atoms])

    @property
    def atomic_radii(self) -> List[float]:
        """Get atomic radii for all atoms.

        Returns:
            List[float]: List of atomic radii in atom order.
        """
        return [atom.radius for atom in self.atoms]

    @property
    def bond_lengths(self) -> Dict[FrozenSet[int], float]:
        """Get dictionary of bond lengths.

        Returns:
            Dict[FrozenSet[int], float]: Mapping of atom index pairs to bond lengths.
        """
        return {bond.indices: bond.length for bond in self.bonds}

    @property
    def adj_list(self) -> Dict[int, Set[int]]:
        """Get adjacency list representation of molecular graph.

        Returns:
            Dict[int, Set[int]]: Dictionary mapping atom indices to sets of bonded atom indices.
        """
        adj: Dict[int, Set[int]] = {}
        for bond in self.bonds:
            i1, i2 = tuple(bond.indices)
            adj.setdefault(i1, set()).add(i2)
            adj.setdefault(i2, set()).add(i1)
        return adj

    @property
    def adj_matrix(self) -> NDArray[np.int_]:
        """Get adjacency matrix representation of molecular graph.

        Returns:
            NDArray[np.int_]: A square matrix where entry (i,j) is 1 if atoms i and j
                are bonded and 0 otherwise.
        """
        n = len(self.atoms)
        matrix = np.zeros((n, n), dtype=np.int_)
        for bond in self.bonds:
            i, j = tuple(bond.indices)
            matrix[i, j] = matrix[j, i] = 1
        return matrix

    def from_string(self, xyz_string: str) -> None:
        """Parse molecular structure from string content in XYZ format.

        The xyz_string must follow XYZ file format conventions:
        - First line: Integer specifying the number of atoms
        - Second line: Comment line (can be empty)
        - Remaining lines: Atom coordinates in the format "element x y z"

        Args:
            xyz_string (str): String content in XYZ format, must include the number of
                            atoms on the first line and a comment/empty line on the second.

        Raises:
            ValueError: If string format is invalid or contains unknown elements.
        """
        try:
            lines = xyz_string.splitlines()

            if len(lines) < 3:
                raise ValueError(
                    "XYZ file must contain at least 3 lines "
                    "(number of atoms, comment, and coordinates)"
                )
            try:
                n_atoms = int(lines[0])
            except (IndexError, ValueError) as err:
                raise ValueError("First line must be an integer (number of atoms)") from err

            self.comment = lines[1].strip()

            # Process coordinate lines
            coordinate_lines = list(filter(None, map(str.strip, lines[2:])))

            self.atoms = []
            for i, line in enumerate(coordinate_lines, start=0):
                parts = line.split()

                try:
                    element = parts[0]
                    if element not in self.default_radii:
                        raise ValueError(f"Unknown element symbol: {element}")

                    x, y, z = map(float, parts[1:4])

                    self.atoms.append(
                        Atom(
                            element=element,
                            x=x,
                            y=y,
                            z=z,
                            index=i,
                            radius=self.default_radii[element],
                        )
                    )
                except (IndexError, ValueError) as err:
                    raise ValueError(
                        f"Invalid format in line {i+3}, expected: element x y z"
                    ) from err

            if len(self.atoms) != n_atoms:
                logger.warning(
                    f"Number of atoms in file ({len(self.atoms)}) doesn't match the number "
                    f"specified in the first line ({n_atoms})"
                )

            self._generate_bonds()

        except Exception as e:
            logger.error(f"Error reading XYZ file: {e}")
            raise

    def read_xyz(self, file_path: Union[str, Path]) -> None:
        """Read molecular structure from XYZ file.

        Args:
            file_path (Union[str, Path]): Path to XYZ format file.

        Raises:
            FileNotFoundError: If the specified file does not exist.
            ValueError: If file format is invalid or contains unknown elements.
        """
        file_path = Path(file_path)
        if not file_path.exists():
            raise FileNotFoundError(f"XYZ file not found: {file_path}")

        try:
            xyz_content = file_path.read_text()
            self.from_string(xyz_content)
        except Exception as e:
            logger.error(f"Error reading XYZ file: {e}")
            raise

    def to_string(self) -> str:
        """Generate XYZ formatted string for the molecular structure.

        Returns:
            str: XYZ formatted string.
        """
        xyz_content = f"{len(self.atoms)}\n"
        xyz_content += f"{self.comment}\n"

        for atom in self.atoms:
            xyz_content += f"{atom.element} {atom.x:.6f} {atom.y:.6f} {atom.z:.6f}\n"

        return xyz_content

    def write_xyz(self, file_path: Union[str, Path]) -> None:
        """Write molecular structure to an XYZ file.

        Args:
            file_path (Union[str, Path]): Path to the XYZ file to write.
        """
        try:
            file_path = Path(file_path)
            xyz_content = self.to_string()
            file_path.write_text(xyz_content)
        except OSError as e:
            print(f"Error writing to file: {e}")

    def _generate_bonds(self) -> None:
        """Generate bonds based on atomic distances.

        Updates the bonds list based on distance criteria between atoms.
        """
        self.bonds.clear()

        distances = self.distance_matrix()

        # Calculate bonding threshold matrix
        radii = np.array(self.atomic_radii)
        thresholds = (radii[:, np.newaxis] + radii) * 1.3

        # Find bonded atoms (creates adjacency matrix internally)
        adj_matrix = np.logical_and(0.1 < distances, thresholds > distances).astype(np.int_)

        # Create Bond objects from adjacency matrix
        for i, j in zip(*np.nonzero(adj_matrix)):
            if i < j:
                self.bonds.append(Bond(self.atoms[i], self.atoms[j]))

    def distance_matrix(self) -> NDArray[np.float64]:
        """Calculates the matrix of interatomic distances using optimized memory handling.

        Attempts to use fast vectorized calculation first, then falls back to a memory-efficient
        loop-based method if memory constraints are encountered.

        Returns:
            NDArray[np.float64]: A square matrix where entry (i,j) represents the
                Euclidean distance between atoms i and j in Angstroms.
        """
        try:
            distances = self.xyz[:, np.newaxis, :] - self.xyz
            return np.sqrt(np.einsum("ijk,ijk->ij", distances, distances))
        except MemoryError:
            # Fall back to loop-based method
            n_atoms = len(self.atoms)
            logger.info("Using memory-efficient method for distance calculation")
            distance_matrix = np.zeros((n_atoms, n_atoms), dtype=np.float64)
            for i in range(n_atoms):
                diff = self.xyz[i] - self.xyz
                distance_matrix[i] = np.sqrt(np.sum(diff * diff, axis=1))
            return distance_matrix

    def formula(self) -> str:
        """Generate molecular formula in Hill notation.

        Returns:
            str: Molecular formula in Hill notation.
        """
        if not self.atoms:
            return ""

        element_counts = Counter(atom.element for atom in self.atoms)
        formula_parts = []

        # Carbon and Hydrogen first
        if "C" in element_counts:
            count = element_counts.pop("C")
            formula_parts.append(f"C{count if count > 1 else ''}")

            if "H" in element_counts:
                count = element_counts.pop("H")
                formula_parts.append(f"H{count if count > 1 else ''}")

        # Add remaining elements alphabetically
        for element in sorted(element_counts):
            count = element_counts[element]
            formula_parts.append(f"{element}{count if count > 1 else ''}")

        return "".join(formula_parts)

    def set_element_radius(self, element: str, radius: float) -> None:
        """Set the reference radius for a specific element.

        Args:
            element (str): Chemical element symbol.
            radius (float): New atomic radius value.
        """
        self.default_radii[element] = radius
        # Update radii for existing atoms of this element
        for atom in self.atoms:
            if atom.element == element:
                atom.radius = radius
        # Regenerate bonds with new radii
        self._generate_bonds()

    def remove(
        self,
        indices: Optional[List[int]] = None,
        elements: Optional[List[str]] = None,
        inplace: bool = False,
    ) -> Optional["MolGraph"]:
        """Remove atoms by indices and/or elements.

        Args:
            indices (List[int], optional): List of atom indices to remove.
            elements (List[str], optional): List of element symbols to remove.
            inplace (bool): If True, modify this instance. If False, return a new instance.

        Returns:
            Optional[MolGraph]: New MolGraph instance if inplace=False, None if inplace=True.

        Raises:
            IndexError: If any index is out of range.
            ValueError: If attempting to remove all atoms. Element symbols that are not
                present in the molecule only trigger a logged warning.
        """
        mask = [True] * len(self.atoms)

        if indices is not None:
            if any(i < 0 or i >= len(self.atoms) for i in indices):
                raise IndexError("Atom index out of range")
            for idx in indices:
                mask[idx] = False

        if elements is not None:
            found_elements = {atom.element for atom in self.atoms if atom.element in elements}
            unused_elements = set(elements) - found_elements
            if unused_elements:
                logger.warning(
                    f"Element(s) not found: {', '.join(sorted(unused_elements))}. "
                    "Use proper case (e.g., 'H' not 'h')"
                )
            mask = [m and atom.element not in elements for m, atom in zip(mask, self.atoms)]

        if not any(mask):
            raise ValueError("Cannot remove all atoms from molecule")

        filtered_atoms = list(compress(self.atoms, mask))

        if inplace:
            self.atoms = filtered_atoms
            self._generate_bonds()
            return None

        new_mol = MolGraph()
        new_mol.atoms = filtered_atoms
        new_mol.comment = self.comment
        new_mol.default_radii = self.default_radii.copy()
        new_mol.cpk_colors = self.cpk_colors.copy()
        new_mol.cpk_color_rest = self.cpk_color_rest
        new_mol._generate_bonds()
        return new_mol

    def to_networkx(self) -> nx.Graph:
        """Convert molecular structure to NetworkX graph.

        Creates a NetworkX graph representation of the molecular structure where atoms are nodes
        and bonds are edges. Each node (atom) index corresponds to the atom's original index in
        the molecular structure and contains element type and 3D coordinates as attributes. Each
        edge (bond) connects two atoms using their indices and stores the bond length.

        Node attributes:
            - element (str): Chemical element symbol of the atom
            - xyz (tuple): 3D coordinates of the atom as (x, y, z) floats

        Edge attributes:
            - length (float): Bond length between the two connected atoms in Angstroms

        Returns:
            nx.Graph: NetworkX undirected graph where:
                - Nodes are atoms with their indices as node identifiers
                - Edges are bonds between atoms
                - Node and edge attributes contain chemical properties
        """
        logger.debug("Creating NetworkX graph")

        G = nx.Graph()

        # Add nodes with attributes
        for atom in self.atoms:
            G.add_node(atom.index, element=atom.element, xyz=(atom.x, atom.y, atom.z))

        # Add edges with attributes
        for bond in self.bonds:
            G.add_edge(bond.atom1.index, bond.atom2.index, length=bond.length)

        return G

    def to_plotly(self, config: Optional[VisualizationConfig] = None) -> go.Figure:
        """Convert molecular structure to Plotly figure.

        Args:
            config (VisualizationConfig, optional): Visualization configuration parameters.

        Returns:
            go.Figure: Interactive molecular visualization as Plotly figure.
        """
        logger.debug("Creating Plotly figure")

        return create_visualization(self, config)

    def __len__(self) -> int:
        """Get the number of atoms in the molecule.

        Returns:
            int: Total number of atoms.
        """
        return len(self.atoms)

    def __getitem__(self, index: int) -> Atom:
        """Get atom at specified index.

        Args:
            index (int): Zero-based index of the atom.

        Returns:
            Atom: Atom object at the specified index.

        Raises:
            IndexError: If index is out of range.
        """
        return self.atoms[index]

    def __repr__(self) -> str:
        """Creates a simplified string representation of the molecular graph.

        Generates a string representation showing the molecular formula in Hill notation
        along with basic structural information about the number of atoms and bonds.

        Returns:
            str: A string in the format "MolGraph(formula: num_atoms atoms, num_bonds bonds)"
                or "MolGraph(empty)" for an empty molecule.
        """
        if not self.atoms:
            return "MolGraph(empty)"

        return f"MolGraph({self.formula()}: {len(self.atoms)} atoms, {len(self.bonds)} bonds)"

elements property

elements: List[str]

Get list of elements in the molecule.

Returns:

| Type | Description |
| --- | --- |
| `List[str]` | List of element symbols in order of appearance. |

indices property

indices: List[int]

Get list of atom indices.

Returns:

| Type | Description |
| --- | --- |
| `List[int]` | List of atom indices in order. |

x property

x: List[float]

Get x coordinates of all atoms.

Returns:

| Type | Description |
| --- | --- |
| `List[float]` | List of x coordinates in atom order. |

y property

y: List[float]

Get y coordinates of all atoms.

Returns:

| Type | Description |
| --- | --- |
| `List[float]` | List of y coordinates in atom order. |

z property

z: List[float]

Get z coordinates of all atoms.

Returns:

| Type | Description |
| --- | --- |
| `List[float]` | List of z coordinates in atom order. |

xyz property

xyz: NDArray[float64]

Get atomic coordinates as a numpy array.

Returns:

| Type | Description |
| --- | --- |
| `NDArray[float64]` | Nx3 array of atomic coordinates. |

atomic_radii property

atomic_radii: List[float]

Get atomic radii for all atoms.

Returns:

| Type | Description |
| --- | --- |
| `List[float]` | List of atomic radii in atom order. |

bond_lengths property

bond_lengths: Dict[FrozenSet[int], float]

Get dictionary of bond lengths.

Returns:

| Type | Description |
| --- | --- |
| `Dict[FrozenSet[int], float]` | Mapping of atom index pairs to bond lengths. |
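Because the keys are frozensets, a lookup does not depend on the order of the two atom indices. The following is a minimal sketch with a plain dictionary; the bond lengths and the `bond_length` helper are illustrative assumptions, not values or functions provided by xyz2graph.

```python
from typing import Dict, FrozenSet

# Hypothetical bond lengths for a water molecule (O=0, H=1, H=2), in Angstroms.
# These numbers are illustrative, not output produced by xyz2graph.
bond_lengths: Dict[FrozenSet[int], float] = {
    frozenset({0, 1}): 0.96,
    frozenset({0, 2}): 0.96,
}


def bond_length(i: int, j: int) -> float:
    """Look up a bond length regardless of the order of the two indices."""
    return bond_lengths[frozenset({i, j})]


length_a = bond_length(0, 1)
length_b = bond_length(1, 0)  # same frozenset key, so the same entry
has_hh_bond = frozenset({1, 2}) in bond_lengths  # the two hydrogens are not bonded
```

With a real `MolGraph`, the same `frozenset({i, j})` key pattern should apply to the mapping returned by `bond_lengths`.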

adj_list property

adj_list: Dict[int, Set[int]]

Get adjacency list representation of molecular graph.

Returns:

| Type | Description |
| --- | --- |
| `Dict[int, Set[int]]` | Dictionary mapping atom indices to sets of bonded atom indices. |
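The adjacency list is assembled from the bond list alone, so an atom without any bonds does not appear as a key. A small standalone sketch of that construction, using plain frozenset pairs in place of `Bond` objects (the `build_adj_list` name is hypothetical):

```python
from typing import Dict, FrozenSet, List, Set


def build_adj_list(bonds: List[FrozenSet[int]]) -> Dict[int, Set[int]]:
    """Build an undirected adjacency list from pairs of atom indices."""
    adj: Dict[int, Set[int]] = {}
    for pair in bonds:
        i, j = tuple(pair)
        # Record the bond in both directions.
        adj.setdefault(i, set()).add(j)
        adj.setdefault(j, set()).add(i)
    return adj


# Water plus one isolated atom with index 3 that takes part in no bond.
water_bonds = [frozenset({0, 1}), frozenset({0, 2})]
adj = build_adj_list(water_bonds)
neighbours_of_isolated = adj.get(3, set())  # use .get() to avoid a KeyError
```

Callers that iterate over all atoms may therefore prefer `adj.get(index, set())` over direct indexing.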

adj_matrix property

adj_matrix: NDArray[int_]

Get adjacency matrix representation of molecular graph.

Returns:

| Type | Description |
| --- | --- |
| `NDArray[int_]` | A square matrix where entry (i,j) is 1 if atoms i and j are bonded and 0 otherwise. |
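As an illustration of the matrix layout, the sketch below fills a symmetric 0/1 matrix from a list of index pairs. It uses nested Python lists instead of a NumPy array so that it runs without third-party packages; `build_adj_matrix` is a hypothetical helper, not part of the library.

```python
from typing import FrozenSet, List


def build_adj_matrix(n_atoms: int, bonds: List[FrozenSet[int]]) -> List[List[int]]:
    """Return an n_atoms x n_atoms matrix with 1 where two atoms are bonded."""
    matrix = [[0] * n_atoms for _ in range(n_atoms)]
    for pair in bonds:
        i, j = tuple(pair)
        # Bonds are undirected, so the matrix is symmetric.
        matrix[i][j] = matrix[j][i] = 1
    return matrix


matrix = build_adj_matrix(3, [frozenset({0, 1}), frozenset({0, 2})])
# Row sums give the number of bonds per atom (the degree of each node).
degrees = [sum(row) for row in matrix]
```

The row sums are one simple use of the matrix form: here atom 0 has two bonds and atoms 1 and 2 have one each.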

from_string

from_string(xyz_string: str) -> None

Parse molecular structure from string content in XYZ format.

The xyz_string must follow XYZ file format conventions:

- First line: Integer specifying the number of atoms
- Second line: Comment line (can be empty)
- Remaining lines: Atom coordinates in the format "element x y z"

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `xyz_string` | `str` | String content in XYZ format. It must include the number of atoms on the first line and a comment (or empty) line on the second. | required |

Raises:

| Type | Description |
| --- | --- |
| `ValueError` | If string format is invalid or contains unknown elements. |
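The layout rules above can be sketched as a small standalone parser. This is a simplified illustration under stated assumptions, not the library's implementation: the `parse_xyz` function and `AtomRecord` alias are hypothetical, and the sketch neither checks element symbols against a radius table nor generates bonds.

```python
from typing import List, Tuple

AtomRecord = Tuple[str, float, float, float]  # (element, x, y, z)


def parse_xyz(xyz_string: str) -> Tuple[int, str, List[AtomRecord]]:
    """Parse XYZ text into (declared atom count, comment, atom records)."""
    lines = xyz_string.splitlines()
    if len(lines) < 3:
        raise ValueError("expected at least 3 lines: atom count, comment, coordinates")
    try:
        n_atoms = int(lines[0])
    except ValueError as err:
        raise ValueError("first line must be an integer atom count") from err

    comment = lines[1].strip()
    atoms: List[AtomRecord] = []
    for line_no, raw in enumerate(lines[2:], start=3):
        line = raw.strip()
        if not line:
            continue  # skip blank lines between or after coordinate rows
        parts = line.split()
        try:
            # Fewer than three coordinates also raises ValueError on unpacking.
            x, y, z = (float(value) for value in parts[1:4])
        except ValueError as err:
            raise ValueError(f"line {line_no}: expected 'element x y z'") from err
        atoms.append((parts[0], x, y, z))
    return n_atoms, comment, atoms


WATER = "3\nwater\nO 0.000 0.000 0.117\nH 0.000 0.757 -0.467\nH 0.000 -0.757 -0.467\n"
n_atoms, comment, atoms = parse_xyz(WATER)
```

Per the method signature, `from_string` itself returns `None` and stores the parsed result on the instance.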

Source code in xyz2graph/graph.py
def from_string(self, xyz_string: str) -> None:
    """Parse molecular structure from string content in XYZ format.

    The xyz_string must follow XYZ file format conventions:
    - First line: Integer specifying the number of atoms
    - Second line: Comment line (can be empty)
    - Remaining lines: Atom coordinates in the format "element x y z"

    Args:
        xyz_string (str): String content in XYZ format, must include the number of
                        atoms on the first line and a comment/empty line on the second.

    Raises:
        ValueError: If string format is invalid or contains unknown elements.
    """
    try:
        lines = xyz_string.splitlines()

        if len(lines) < 3:
            raise ValueError(
                "XYZ file must contain at least 3 lines "
                "(number of atoms, comment, and coordinates)"
            )
        try:
            n_atoms = int(lines[0])
        except (IndexError, ValueError) as err:
            raise ValueError("First line must be an integer (number of atoms)") from err

        self.comment = lines[1].strip()

        # Process coordinate lines
        coordinate_lines = list(filter(None, map(str.strip, lines[2:])))

        self.atoms = []
        for i, line in enumerate(coordinate_lines, start=0):
            parts = line.split()

            try:
                element = parts[0]
                if element not in self.default_radii:
                    raise ValueError(f"Unknown element symbol: {element}")

                x, y, z = map(float, parts[1:4])

                self.atoms.append(
                    Atom(
                        element=element,
                        x=x,
                        y=y,
                        z=z,
                        index=i,
                        radius=self.default_radii[element],
                    )
                )
            except (IndexError, ValueError) as err:
                raise ValueError(
                    f"Invalid format in line {i+3}, expected: element x y z"
                ) from err

        if len(self.atoms) != n_atoms:
            logger.warning(
                f"Number of atoms in file ({len(self.atoms)}) doesn't match the number "
                f"specified in the first line ({n_atoms})"
            )

        self._generate_bonds()

    except Exception as e:
        logger.error(f"Error reading XYZ file: {e}")
        raise

read_xyz

read_xyz(file_path: Union[str, Path]) -> None

Read molecular structure from XYZ file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `file_path` | `Union[str, Path]` | Path to XYZ format file. | required |

Raises:

| Type | Description |
| --- | --- |
| `FileNotFoundError` | If the specified file does not exist. |
| `ValueError` | If file format is invalid or contains unknown elements. |

Source code in xyz2graph/graph.py
def read_xyz(self, file_path: Union[str, Path]) -> None:
    """Read molecular structure from XYZ file.

    Args:
        file_path (Union[str, Path]): Path to XYZ format file.

    Raises:
        FileNotFoundError: If the specified file does not exist.
        ValueError: If file format is invalid or contains unknown elements.
    """
    file_path = Path(file_path)
    if not file_path.exists():
        raise FileNotFoundError(f"XYZ file not found: {file_path}")

    try:
        xyz_content = file_path.read_text()
        self.from_string(xyz_content)
    except Exception as e:
        logger.error(f"Error reading XYZ file: {e}")
        raise

to_string

to_string() -> str

Generate XYZ formatted string for the molecular structure.

Returns:

| Type | Description |
| --- | --- |
| `str` | XYZ formatted string. |
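To show the shape of the output, here is a minimal standalone sketch that writes the atom count, the comment, and one coordinate row per atom with six decimal places. The `format_xyz` helper and its tuple-based input are assumptions made for illustration.

```python
from typing import List, Tuple


def format_xyz(comment: str, atoms: List[Tuple[str, float, float, float]]) -> str:
    """Serialize (element, x, y, z) records into XYZ text."""
    lines = [str(len(atoms)), comment]
    lines += [f"{element} {x:.6f} {y:.6f} {z:.6f}" for element, x, y, z in atoms]
    # Every line, including the last, ends with a newline.
    return "\n".join(lines) + "\n"


text = format_xyz("water fragment", [("O", 0.0, 0.0, 0.117), ("H", 0.0, 0.757, -0.467)])
```

Fixing the precision at six decimals keeps the output stable across repeated read and write cycles, at the cost of rounding finer input.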

Source code in xyz2graph/graph.py
def to_string(self) -> str:
    """Generate XYZ formatted string for the molecular structure.

    Returns:
        str: XYZ formatted string.
    """
    xyz_content = f"{len(self.atoms)}\n"
    xyz_content += f"{self.comment}\n"

    for atom in self.atoms:
        xyz_content += f"{atom.element} {atom.x:.6f} {atom.y:.6f} {atom.z:.6f}\n"

    return xyz_content

write_xyz

write_xyz(file_path: Union[str, Path]) -> None

Write molecular structure to an XYZ file.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `file_path` | `Union[str, Path]` | Path to the XYZ file to write. | required |

Source code in xyz2graph/graph.py
def write_xyz(self, file_path: Union[str, Path]) -> None:
    """Write molecular structure to an XYZ file.

    Args:
        file_path (Union[str, Path]): Path to the XYZ file to write.
    """
    try:
        file_path = Path(file_path)
        xyz_content = self.to_string()
        file_path.write_text(xyz_content)
    except OSError as e:
        print(f"Error writing to file: {e}")

distance_matrix

distance_matrix() -> NDArray[np.float64]

Calculates the matrix of interatomic distances using optimized memory handling.

Attempts to use fast vectorized calculation first, then falls back to a memory-efficient loop-based method if memory constraints are encountered.

Returns:

| Type | Description |
| --- | --- |
| `NDArray[float64]` | A square matrix where entry (i,j) represents the Euclidean distance between atoms i and j in Angstroms. |
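For intuition about the result, the sketch below computes the same kind of pairwise distance matrix with the standard library only. It is closer in spirit to the loop-based fallback than to the vectorized path, and the `pairwise_distances` name is hypothetical.

```python
import math
from typing import List, Sequence


def pairwise_distances(coords: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return a square matrix of Euclidean distances between all points."""
    return [[math.dist(a, b) for b in coords] for a in coords]


coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 2.0)]
dm = pairwise_distances(coords)
```

The matrix is symmetric with zeros on the diagonal, and it needs memory for N*N entries, which is why large structures can become expensive.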

Source code in xyz2graph/graph.py
def distance_matrix(self) -> NDArray[np.float64]:
    """Calculates the matrix of interatomic distances using optimized memory handling.

    Attempts to use fast vectorized calculation first, then falls back to a memory-efficient
    loop-based method if memory constraints are encountered.

    Returns:
        NDArray[np.float64]: A square matrix where entry (i,j) represents the
            Euclidean distance between atoms i and j in Angstroms.
    """
    try:
        distances = self.xyz[:, np.newaxis, :] - self.xyz
        return np.sqrt(np.einsum("ijk,ijk->ij", distances, distances))
    except MemoryError:
        # Fall back to loop-based method
        n_atoms = len(self.atoms)
        logger.info("Using memory-efficient method for distance calculation")
        distance_matrix = np.zeros((n_atoms, n_atoms), dtype=np.float64)
        for i in range(n_atoms):
            diff = self.xyz[i] - self.xyz
            distance_matrix[i] = np.sqrt(np.sum(diff * diff, axis=1))
        return distance_matrix

formula

formula() -> str

Generate molecular formula in Hill notation.

Returns:

| Type | Description |
| --- | --- |
| `str` | Molecular formula in Hill notation. |
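In Hill notation, carbon comes first and hydrogen second when carbon is present, followed by the remaining elements alphabetically; without carbon, all elements (including hydrogen) are sorted alphabetically. A compact sketch of that ordering, written as a hypothetical standalone function over a list of element symbols:

```python
from collections import Counter
from typing import Iterable, List


def hill_formula(elements: Iterable[str]) -> str:
    """Build a Hill-notation formula from element symbols."""
    counts = Counter(elements)
    parts: List[str] = []

    def emit(symbol: str, n: int) -> None:
        # A count of 1 is left implicit, as in "CH4".
        parts.append(f"{symbol}{n if n > 1 else ''}")

    if "C" in counts:
        emit("C", counts.pop("C"))
        if "H" in counts:
            emit("H", counts.pop("H"))
    for symbol in sorted(counts):
        emit(symbol, counts[symbol])
    return "".join(parts)


ethanol = hill_formula(["C", "C", "O"] + ["H"] * 6)
water = hill_formula(["O", "H", "H"])
salt = hill_formula(["Na", "Cl"])
```

Note that the carbon-free case is plain alphabetical order, which is why sodium chloride comes out with chlorine first.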

Source code in xyz2graph/graph.py
def formula(self) -> str:
    """Generate molecular formula in Hill notation.

    Returns:
        str: Molecular formula in Hill notation.
    """
    if not self.atoms:
        return ""

    element_counts = Counter(atom.element for atom in self.atoms)
    formula_parts = []

    # Carbon and Hydrogen first
    if "C" in element_counts:
        count = element_counts.pop("C")
        formula_parts.append(f"C{count if count > 1 else ''}")

        if "H" in element_counts:
            count = element_counts.pop("H")
            formula_parts.append(f"H{count if count > 1 else ''}")

    # Add remaining elements alphabetically
    for element in sorted(element_counts):
        count = element_counts[element]
        formula_parts.append(f"{element}{count if count > 1 else ''}")

    return "".join(formula_parts)

set_element_radius

set_element_radius(element: str, radius: float) -> None

Set the reference radius for a specific element.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| `element` | `str` | Chemical element symbol. | required |
| `radius` | `float` | New atomic radius value. | required |

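Changing a radius matters because bonds are regenerated from a distance criterion: in the class source, two atoms count as bonded when their distance is above 0.1 and below 1.3 times the sum of their radii. The sketch below reproduces that criterion in plain Python. The `find_bonds` helper is hypothetical and the radii are illustrative values in Angstroms, not the library's `_DEFAULT_RADII`.

```python
import math
from typing import Dict, List, Sequence, Tuple


def find_bonds(
    elements: Sequence[str],
    coords: Sequence[Sequence[float]],
    radii: Dict[str, float],
    scale: float = 1.3,
    min_dist: float = 0.1,
) -> List[Tuple[int, int]]:
    """Return index pairs (i < j) whose distance passes the bonding criterion."""
    bonds = []
    for i in range(len(elements)):
        for j in range(i + 1, len(elements)):
            d = math.dist(coords[i], coords[j])
            threshold = scale * (radii[elements[i]] + radii[elements[j]])
            if min_dist < d < threshold:
                bonds.append((i, j))
    return bonds


elements = ["O", "H", "H"]
coords = [(0.0, 0.0, 0.117), (0.0, 0.757, -0.467), (0.0, -0.757, -0.467)]
radii = {"O": 0.66, "H": 0.31}  # illustrative radii, assumed for this example

bonds_before = find_bonds(elements, coords, radii)
# Enlarging the hydrogen radius raises the H-H threshold above the H-H distance.
bonds_after = find_bonds(elements, coords, dict(radii, H=0.7))
```

With the smaller radius only the two O-H pairs qualify; with the enlarged radius the H-H pair is also treated as bonded, which shows how a single radius change can alter connectivity.
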
Source code in xyz2graph/graph.py
def set_element_radius(self, element: str, radius: float) -> None:
    """Set the reference radius for a specific element.

    Args:
        element (str): Chemical element symbol.
        radius (float): New atomic radius value.
    """
    self.default_radii[element] = radius
    # Update radii for existing atoms of this element
    for atom in self.atoms:
        if atom.element == element:
            atom.radius = radius
    # Regenerate bonds with new radii
    self._generate_bonds()
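
Why a radius change forces bond regeneration can be shown with a toy model. The bonding criterion used by `_generate_bonds` is not shown in this section, so the sketch below assumes, purely for illustration, that two atoms are bonded when their distance does not exceed the sum of their radii. `ToyAtom` and `toy_bonds` are hypothetical stand-ins, not xyz2graph classes or functions.

```python
import math
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass
class ToyAtom:
    """Hypothetical stand-in for an atom; not xyz2graph's Atom class."""
    element: str
    x: float
    y: float
    z: float
    radius: float


def toy_bonds(atoms: List[ToyAtom]) -> Set[FrozenSet[int]]:
    """Assumed criterion: bonded if distance <= sum of the two radii."""
    bonds = set()
    for i, a in enumerate(atoms):
        for j in range(i + 1, len(atoms)):
            b = atoms[j]
            dist = math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
            if dist <= a.radius + b.radius:
                bonds.add(frozenset((i, j)))
    return bonds


# Illustrative values only: two atoms 1.5 units apart.
atoms = [ToyAtom("O", 0.0, 0.0, 0.0, 0.6), ToyAtom("H", 1.5, 0.0, 0.0, 0.3)]
before = toy_bonds(atoms)  # 1.5 > 0.6 + 0.3, so no bond is detected

# Mirror the update loop: change the radius of every atom of one element.
for atom in atoms:
    if atom.element == "H":
        atom.radius = 1.0
after = toy_bonds(atoms)  # 1.5 <= 0.6 + 1.0, so the pair is now bonded
```

Under this assumed criterion the connectivity depends directly on the radii, which is why stale bonds would be left behind if they were not recomputed after the update.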

remove

remove(
    indices: Optional[List[int]] = None,
    elements: Optional[List[str]] = None,
    inplace: bool = False,
) -> Optional[MolGraph]

Remove atoms by indices and/or elements.

Parameters:

Name Type Description Default
indices List[int]

List of atom indices to remove.

None
elements List[str]

List of element symbols to remove.

None
inplace bool

If True, modify this instance. If False, return a new instance.

False

Returns:

Type Description
Optional[MolGraph]

Optional[MolGraph]: New MolGraph instance if inplace=False, None if inplace=True.

Raises:

Type Description
IndexError

If any index is out of range.

ValueError

If attempting to remove all atoms. Elements that are not present in the molecule do not raise; they only trigger a logged warning.

Source code in xyz2graph/graph.py
def remove(
    self,
    indices: Optional[List[int]] = None,
    elements: Optional[List[str]] = None,
    inplace: bool = False,
) -> Optional["MolGraph"]:
    """Remove atoms by indices and/or elements.

    Args:
        indices (List[int], optional): List of atom indices to remove.
        elements (List[str], optional): List of element symbols to remove.
        inplace (bool): If True, modify this instance. If False, return a new instance.

    Returns:
        Optional[MolGraph]: New MolGraph instance if inplace=False, None if inplace=True.

    Raises:
        IndexError: If any index is out of range.
        ValueError: If attempting to remove all atoms. Elements that are not present in the
            molecule do not raise; they only trigger a logged warning.
    """
    mask = [True] * len(self.atoms)

    if indices is not None:
        if any(i < 0 or i >= len(self.atoms) for i in indices):
            raise IndexError("Atom index out of range")
        for idx in indices:
            mask[idx] = False

    if elements is not None:
        found_elements = {atom.element for atom in self.atoms if atom.element in elements}
        unused_elements = set(elements) - found_elements
        if unused_elements:
            logger.warning(
                f"Element(s) not found: {', '.join(sorted(unused_elements))}. "
                "Use proper case (e.g., 'H' not 'h')"
            )
        mask = [m and atom.element not in elements for m, atom in zip(mask, self.atoms)]

    if not any(mask):
        raise ValueError("Cannot remove all atoms from molecule")

    filtered_atoms = list(compress(self.atoms, mask))

    if inplace:
        self.atoms = filtered_atoms
        self._generate_bonds()
        return None

    new_mol = MolGraph()
    new_mol.atoms = filtered_atoms
    new_mol.comment = self.comment
    new_mol.default_radii = self.default_radii.copy()
    new_mol.cpk_colors = self.cpk_colors.copy()
    new_mol.cpk_color_rest = self.cpk_color_rest
    new_mol._generate_bonds()
    return new_mol
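
The mask-and-compress selection used above can be tried on a plain list of element symbols. This is a simplified sketch rather than the library's method: `filter_atoms` is a hypothetical helper that works on strings instead of `Atom` objects and leaves out logging and bond regeneration.

```python
from itertools import compress
from typing import List, Optional, Sequence


def filter_atoms(
    elements: Sequence[str],
    indices: Optional[List[int]] = None,
    remove_elements: Optional[List[str]] = None,
) -> List[str]:
    """Drop entries by position and/or by element symbol (illustrative helper)."""
    mask = [True] * len(elements)  # True means "keep this atom"

    if indices is not None:
        if any(i < 0 or i >= len(elements) for i in indices):
            raise IndexError("Atom index out of range")
        for idx in indices:
            mask[idx] = False

    if remove_elements is not None:
        # Combine with the index mask: an atom survives only if both allow it.
        mask = [m and el not in remove_elements for m, el in zip(mask, elements)]

    if not any(mask):
        raise ValueError("Cannot remove all atoms from molecule")

    return list(compress(elements, mask))


methanol = ["C", "O", "H", "H", "H", "H"]
heavy_atoms = filter_atoms(methanol, remove_elements=["H"])
carbon_only = filter_atoms(methanol, indices=[1], remove_elements=["H"])
```

Because both filters write into the same mask, passing `indices` and `elements` together removes the union of the two selections.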

to_networkx

to_networkx() -> nx.Graph

Convert molecular structure to NetworkX graph.

Creates a NetworkX graph representation of the molecular structure where atoms are nodes and bonds are edges. Each node is keyed by the atom's index in the molecular structure and carries the element symbol and 3D coordinates as attributes. Each edge connects two atom indices and stores the bond length.

Node attributes
  • element (str): Chemical element symbol of the atom
  • xyz (tuple): 3D coordinates of the atom as (x, y, z) floats
Edge attributes
  • length (float): Bond length between the two connected atoms in Angstroms

Returns:

Type Description
Graph

nx.Graph: NetworkX undirected graph where:
  • Nodes are atoms with their indices as node identifiers
  • Edges are bonds between atoms
  • Node and edge attributes contain chemical properties

Source code in xyz2graph/graph.py
def to_networkx(self) -> nx.Graph:
    """Convert molecular structure to NetworkX graph.

    Creates a NetworkX graph representation of the molecular structure where atoms are nodes
    and bonds are edges. Each node is keyed by the atom's index in the molecular structure
    and carries the element symbol and 3D coordinates as attributes. Each edge connects two
    atom indices and stores the bond length.

    Node attributes:
        - element (str): Chemical element symbol of the atom
        - xyz (tuple): 3D coordinates of the atom as (x, y, z) floats

    Edge attributes:
        - length (float): Bond length between the two connected atoms in Angstroms

    Returns:
        nx.Graph: NetworkX undirected graph where:
            - Nodes are atoms with their indices as node identifiers
            - Edges are bonds between atoms
            - Node and edge attributes contain chemical properties
    """
    logger.debug("Creating NetworkX graph")

    G = nx.Graph()

    # Add nodes with attributes
    for atom in self.atoms:
        G.add_node(atom.index, element=atom.element, xyz=(atom.x, atom.y, atom.z))

    # Add edges with attributes
    for bond in self.bonds:
        G.add_edge(bond.atom1.index, bond.atom2.index, length=bond.length)

    return G
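
A graph with this node and edge schema can be queried with standard NetworkX calls. The sketch below builds a small graph by hand instead of calling `to_networkx`, so it runs without an XYZ file; the water-like coordinates and bond lengths are illustrative values, not output from the library.

```python
import networkx as nx

# Hand-built graph that follows the schema described above.
G = nx.Graph()
G.add_node(0, element="O", xyz=(0.0, 0.0, 0.0))
G.add_node(1, element="H", xyz=(0.96, 0.0, 0.0))
G.add_node(2, element="H", xyz=(-0.24, 0.93, 0.0))
G.add_edge(0, 1, length=0.96)
G.add_edge(0, 2, length=0.96)

# Atoms bonded to the oxygen (node 0).
oxygen_neighbors = sorted(G.neighbors(0))

# Select nodes by the "element" attribute.
hydrogens = [n for n, data in G.nodes(data=True) if data["element"] == "H"]

# Read a bond length from the "length" edge attribute.
oh_length = G.edges[0, 1]["length"]

# Count disconnected fragments (1 for a single molecule).
n_fragments = nx.number_connected_components(G)
```

Since node identifiers are the atom indices, results such as `oxygen_neighbors` can be mapped straight back to atoms in the original structure.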

to_plotly

to_plotly(
    config: Optional[VisualizationConfig] = None,
) -> go.Figure

Convert molecular structure to Plotly figure.

Parameters:

Name Type Description Default
config VisualizationConfig

Visualization configuration parameters.

None

Returns:

Type Description
Figure

go.Figure: Interactive molecular visualization as Plotly figure.

Source code in xyz2graph/graph.py
def to_plotly(self, config: Optional[VisualizationConfig] = None) -> go.Figure:
    """Convert molecular structure to Plotly figure.

    Args:
        config (VisualizationConfig, optional): Visualization configuration parameters.

    Returns:
        go.Figure: Interactive molecular visualization as Plotly figure.
    """
    logger.debug("Creating Plotly figure")

    return create_visualization(self, config)