The PubChemR package introduces a pivotal function, get_pug_rest, designed to
facilitate seamless access to the
vast chemical data repository of PubChem. This function leverages the
capabilities of PubChem’s Power User Gateway (PUG) REST service,
providing a straightforward and efficient means for users to
programmatically interact with PubChem’s extensive database. This
vignette aims to elucidate the structure and usage of the PUG REST
service, offering a range of illustrative use cases to aid new users in
understanding its operation and constructing effective requests.
PUG REST, standing for Power User Gateway RESTful interface, is a simplified access route to PubChem’s data and services. It is designed for scripts, web page embedded JavaScript, and third-party applications, eliminating the need for the more complex XML and SOAP envelopes required by other PUG variants. PUG REST’s design revolves around the PubChem identifier (SID for substances, CID for compounds, and AID for assays) and is structured into three main request components: input (identifiers), operation (actions on identifiers), and output (desired information format).
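The three request components map onto consecutive segments of a PUG REST URL path. The following is a minimal sketch of that mapping; the helper name build_pug_url is hypothetical and not part of PubChemR, and the code only assembles a URL string without sending a request.

```r
# Hypothetical helper, not part of PubChemR: assembles a PUG REST URL
# from the input (domain/namespace/identifier), operation, and output parts.
build_pug_url <- function(domain, namespace, identifier,
                          operation = NULL, output = "JSON") {
  base <- "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
  # Multiple identifiers are sent as one comma-separated path segment.
  ids <- paste(identifier, collapse = ",")
  # A NULL operation is dropped by c(), which requests the full record.
  parts <- c(base, domain, namespace, ids, operation, output)
  paste(parts, collapse = "/")
}

url <- build_pug_url("compound", "cid", 2244,
                     operation = "property/MolecularFormula",
                     output = "JSON")
print(url)
```

Keeping the input, operation, and output as separate arguments mirrors the way get_pug_rest exposes domain, namespace, identifier, operation, and output.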
Usage Policy:
get_pug_rest Overview
The get_pug_rest function in the PubChemR package provides a versatile
interface to access a wide range of
chemical data from the PubChem database. This section of the vignette
focuses on various methods to retrieve chemical structure information
and other related data using the PUG REST service. The function is
designed to be flexible, accommodating different input methods,
operations, and output formats.
This function sends a request to the PubChem PUG REST API to retrieve various types of data for a given identifier. It supports fetching data in different formats and allows saving the output.
get_pug_rest(
  identifier = NULL,
  namespace = "cid",
  domain = "compound",
  operation = NULL,
  output = "JSON",
  searchtype = NULL,
  property = NULL,
  options = NULL,
  save = FALSE,
  dpi = 300,
  path = NULL,
  file_name = NULL,
  ...
)
Arguments
identifier: A vector of identifiers for the query, either numeric or character. This is the main input for querying the PubChem database.
namespace: A character string specifying the namespace for the request. Default is ‘cid’. This defines the type of identifier being used, such as ‘cid’ (Compound ID), ‘sid’ (Substance ID), etc.
domain: A character string specifying the domain for the request. Default is ‘compound’. This indicates the type of entity being queried, such as ‘compound’, ‘substance’, etc.
operation: An optional character string specifying the operation for the request. This can be used to specify particular actions or methods within the API.
output: A character string specifying the output format. Possible values are ‘SDF’, ‘JSON’, ‘JSONP’, ‘CSV’, ‘TXT’, and ‘PNG’. Default is ‘JSON’. This defines the format in which the data will be returned.
searchtype: An optional character string specifying the search type. This allows for specifying the nature of the search being performed.
property: An optional character string specifying the property for the request. This can be used to filter or specify particular properties of the data being retrieved.
options: A list of additional options for the request. This allows for further customization and fine-tuning of the request parameters.
save: A logical value indicating whether to save the output as a file or image. Default is FALSE. When set to TRUE, the function will save the retrieved data to a specified file.
dpi: An integer specifying the DPI for image output. Default is 300. This is relevant when the output format is an image, determining the resolution of the saved image.
path: A character string specifying the directory path where the file will be saved. If not provided, the current working directory is used.
file_name: A character string of length 1. Defines the name of the file (without file extension) to save. If NULL, the default file name is set as “files_downloaded”.
...: Additional arguments to be passed to the request.
Value
The function returns different types of content based on the specified output format:
JSON: Returns a list.
CSV and TXT: Returns a data frame.
SDF: Returns an SDF file of the requested identifier.
PNG: Returns an image object or saves an image file.
In the context of the PubChem PUG REST API, the input methods define how records of interest are specified for a request. There are several ways to define this input, with the most common methods being outlined below:
1. By Identifier: The most straightforward way to specify input is to use identifiers directly, such as Substance IDs (SIDs) or Compound IDs (CIDs). For example, to retrieve the full compound record for CID 2244 (aspirin), you can use the get_pug_rest function as follows:
result <- get_pug_rest(identifier = "2244", namespace = "cid", domain = "compound", output = "JSON")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Compound
#> - Namespace: CID
#> - Operation: <NULL>
#> - Identifier: 2244
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
The pubChemData function then processes the result to extract and display the retrieved data. Here’s an interpretation of the output for CID 2244:
pubChemDataResult <- pubChemData(result)
The JSON response contains detailed information about the compound identified by CID 2244. The PC_Compounds array holds the compound data, and within it, each element corresponds to a specific compound.
For CID 2244, the following information is retrieved:
ID: Confirms the compound identifier is CID 2244.
pubChemDataResult$PC_Compounds[[1]]$id
#> $id
#> cid
#> 2244
Atoms: Details the atomic composition, with an aid array listing the atom IDs and an element array listing the atomic numbers (e.g., 6 for carbon, 8 for oxygen, 1 for hydrogen).
pubChemDataResult$PC_Compounds[[1]]$atoms
#> $aid
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
#>
#> $element
#> [1] 8 8 8 8 6 6 6 6 6 6 6 6 6 1 1 1 1 1 1 1 1
Bonds: Describes the bonds between atoms, including arrays for the IDs of the atoms involved (aid1 and aid2) and the bond order.
pubChemDataResult$PC_Compounds[[1]]$bonds
#> $aid1
#> [1] 1 1 2 2 3 4 5 5 6 6 7 7 8 8 9 9 10 12 13 13 13
#>
#> $aid2
#> [1] 5 12 11 21 11 12 6 7 8 11 9 14 10 15 10 16 17 13 18 19 20
#>
#> $order
#> [1] 1 1 1 1 2 2 1 2 2 1 1 1 1 1 2 1 1 1 1 1 1
Coordinates: Provides the spatial coordinates (x and y) for each atom, which can be used to visualize the molecular structure.
pubChemDataResult$PC_Compounds[[1]]$coords
#> [[1]]
#> [[1]]$type
#> [1] 1 5 255
#>
#> [[1]]$aid
#> [1] 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
#>
#> [[1]]$conformers
#> [[1]]$conformers[[1]]
#> [[1]]$conformers[[1]]$x
#> [1] 3.7320 6.3301 4.5981 2.8660 4.5981 5.4641 4.5981 6.3301 5.4641 6.3301
#> [11] 5.4641 2.8660 2.0000 4.0611 6.8671 5.4641 6.8671 2.3100 1.4631 1.6900
#> [21] 6.3301
#>
#> [[1]]$conformers[[1]]$y
#> [1] -0.0600 1.4400 1.4400 -1.5600 -0.5600 -0.0600 -1.5600 -0.5600 -2.0600
#> [10] -1.5600 0.9400 -0.5600 -0.0600 -1.8700 -0.2500 -2.6800 -1.8700 0.4769
#> [19] 0.2500 -0.5969 2.0600
#>
#> [[1]]$conformers[[1]]$style
#> [[1]]$conformers[[1]]$style$annotation
#> [1] 8 8 8 8 8 8
#>
#> [[1]]$conformers[[1]]$style$aid1
#> [1] 5 5 6 7 8 9
#>
#> [[1]]$conformers[[1]]$style$aid2
#> [1] 6 7 8 9 10 10
Charge: Indicates the compound’s charge, which is 0 in this case.
pubChemDataResult$PC_Compounds[[1]]$charge
#> [1] 0
Properties: Lists various properties of the compound, including:
Compound Complexity: A measure of the molecular complexity.
pubChemDataResult$PC_Compounds[[1]]$props[[2]]
#> $urn
#> $urn$label
#> [1] "Compound Complexity"
#>
#> $urn$datatype
#> [1] 7
#>
#> $urn$implementation
#> [1] "E_COMPLEXITY"
#>
#> $urn$version
#> [1] "3.4.8.18"
#>
#> $urn$software
#> [1] "Cactvs"
#>
#> $urn$source
#> [1] "Xemistry GmbH"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> fval
#> 212
Hydrogen Bond Acceptor/Donor Count: Indicates the number of hydrogen bond acceptors and donors.
pubChemDataResult$PC_Compounds[[1]]$props[[3]]
#> $urn
#> $urn$label
#> [1] "Count"
#>
#> $urn$name
#> [1] "Hydrogen Bond Acceptor"
#>
#> $urn$datatype
#> [1] 5
#>
#> $urn$implementation
#> [1] "E_NHACCEPTORS"
#>
#> $urn$version
#> [1] "3.4.8.18"
#>
#> $urn$software
#> [1] "Cactvs"
#>
#> $urn$source
#> [1] "Xemistry GmbH"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> ival
#> 4
pubChemDataResult$PC_Compounds[[1]]$props[[4]]
#> $urn
#> $urn$label
#> [1] "Count"
#>
#> $urn$name
#> [1] "Hydrogen Bond Donor"
#>
#> $urn$datatype
#> [1] 5
#>
#> $urn$implementation
#> [1] "E_NHDONORS"
#>
#> $urn$version
#> [1] "3.4.8.18"
#>
#> $urn$software
#> [1] "Cactvs"
#>
#> $urn$source
#> [1] "Xemistry GmbH"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> ival
#> 1
Rotatable Bond Count: The number of rotatable bonds, which impacts the molecule’s flexibility.
pubChemDataResult$PC_Compounds[[1]]$props[[5]]
#> $urn
#> $urn$label
#> [1] "Count"
#>
#> $urn$name
#> [1] "Rotatable Bond"
#>
#> $urn$datatype
#> [1] 5
#>
#> $urn$implementation
#> [1] "E_NROTBONDS"
#>
#> $urn$version
#> [1] "3.4.8.18"
#>
#> $urn$software
#> [1] "Cactvs"
#>
#> $urn$source
#> [1] "Xemistry GmbH"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> ival
#> 3
IUPAC Names: Various standardized names for the compound, such as “2-acetoxybenzoic acid” and “2-acetyloxybenzoic acid”.
pubChemDataResult$PC_Compounds[[1]]$props[[7]]
#> $urn
#> $urn$label
#> [1] "IUPAC Name"
#>
#> $urn$name
#> [1] "Allowed"
#>
#> $urn$datatype
#> [1] 1
#>
#> $urn$version
#> [1] "2.7.0"
#>
#> $urn$software
#> [1] "Lexichem TK"
#>
#> $urn$source
#> [1] "OpenEye Scientific Software"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> sval
#> "2-acetoxybenzoic acid"
InChI and InChIKey: Standardized identifiers for the chemical structure.
pubChemDataResult$PC_Compounds[[1]]$props[[13]]
#> $urn
#> $urn$label
#> [1] "InChI"
#>
#> $urn$name
#> [1] "Standard"
#>
#> $urn$datatype
#> [1] 1
#>
#> $urn$version
#> [1] "1.0.6"
#>
#> $urn$software
#> [1] "InChI"
#>
#> $urn$source
#> [1] "iupac.org"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> sval
#> "InChI=1S/C9H8O4/c1-6(10)13-8-5-3-2-4-7(8)9(11)12/h2-5H,1H3,(H,11,12)"
pubChemDataResult$PC_Compounds[[1]]$props[[14]]
#> $urn
#> $urn$label
#> [1] "InChIKey"
#>
#> $urn$name
#> [1] "Standard"
#>
#> $urn$datatype
#> [1] 1
#>
#> $urn$version
#> [1] "1.0.6"
#>
#> $urn$software
#> [1] "InChI"
#>
#> $urn$source
#> [1] "iupac.org"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> sval
#> "BSYNRYMUTXBXSQ-UHFFFAOYSA-N"
Log P: The partition coefficient, indicating the compound’s hydrophobicity.
pubChemDataResult$PC_Compounds[[1]]$props[[15]]
#> $urn
#> $urn$label
#> [1] "Log P"
#>
#> $urn$name
#> [1] "XLogP3"
#>
#> $urn$datatype
#> [1] 7
#>
#> $urn$version
#> [1] "3.0"
#>
#> $urn$source
#> [1] "sioc-ccbg.ac.cn"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> fval
#> 1.2
Molecular Formula: The chemical formula of the compound, which is C9H8O4.
pubChemDataResult$PC_Compounds[[1]]$props[[17]]
#> $urn
#> $urn$label
#> [1] "Molecular Formula"
#>
#> $urn$datatype
#> [1] 1
#>
#> $urn$version
#> [1] "2.2"
#>
#> $urn$software
#> [1] "PubChem"
#>
#> $urn$source
#> [1] "ncbi.nlm.nih.gov"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> sval
#> "C9H8O4"
Molecular Weight: The compound’s molecular weight, 180.16 g/mol.
pubChemDataResult$PC_Compounds[[1]]$props[[18]]
#> $urn
#> $urn$label
#> [1] "Molecular Weight"
#>
#> $urn$datatype
#> [1] 1
#>
#> $urn$version
#> [1] "2.2"
#>
#> $urn$software
#> [1] "PubChem"
#>
#> $urn$source
#> [1] "ncbi.nlm.nih.gov"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> sval
#> "180.16"
Topological Polar Surface Area: A measure of the molecule’s polar surface area, which relates to its capacity to form hydrogen bonds.
SMILES: Canonical and isomeric Simplified Molecular Input Line Entry System (SMILES) strings, which are text representations of the chemical structure.
pubChemDataResult$PC_Compounds[[1]]$props[[19]]
#> $urn
#> $urn$label
#> [1] "SMILES"
#>
#> $urn$name
#> [1] "Canonical"
#>
#> $urn$datatype
#> [1] 1
#>
#> $urn$version
#> [1] "2.3.0"
#>
#> $urn$software
#> [1] "OEChem"
#>
#> $urn$source
#> [1] "OpenEye Scientific Software"
#>
#> $urn$release
#> [1] "2021.10.14"
#>
#>
#> $value
#> sval
#> "CC(=O)OC1=CC=CC=C1C(=O)O"
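The positional indices used above (props[[2]], props[[17]], and so on) depend on the order in which PubChem returns the properties. A more robust pattern is to look entries up by their urn label and name. The following is a minimal sketch on mock data shaped like the output shown above; the helper find_prop is hypothetical and not part of PubChemR.

```r
# Hypothetical helper, not part of PubChemR: finds property entries by
# their urn label (and optionally urn name) instead of by position.
find_prop <- function(props, label, name = NULL) {
  keep <- vapply(props, function(p) {
    same_label <- identical(p$urn$label, label)
    same_name <- is.null(name) || identical(p$urn$name, name)
    same_label && same_name
  }, logical(1))
  props[keep]
}

# Mock data shaped like two entries of PC_Compounds[[1]]$props.
props <- list(
  list(urn = list(label = "Count", name = "Hydrogen Bond Donor"),
       value = c(ival = 1)),
  list(urn = list(label = "Molecular Formula"),
       value = c(sval = "C9H8O4"))
)

formula <- find_prop(props, "Molecular Formula")[[1]]$value[["sval"]]
donors <- find_prop(props, "Count", "Hydrogen Bond Donor")[[1]]$value[["ival"]]
print(formula)
print(donors)
```

With a real result, the same helper could be applied to pubChemDataResult$PC_Compounds[[1]]$props, assuming the entries keep the urn/value layout shown in the printed output.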
For multiple IDs, a vector of IDs can be used. For instance, to retrieve a CSV table of compound properties:
result <- get_pug_rest(identifier = c("1","2","3","4","5"), namespace = "cid", domain = "compound", property = c("MolecularFormula","MolecularWeight","CanonicalSMILES"), output = "CSV")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Compound
#> - Namespace: CID
#> - Operation: <NULL>
#> - Identifier: 1, 2, ... and 3 more.
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
The output of this request, when processed with pubChemData, provides a data frame containing the specified properties for each CID:
pubChemData(result)
#> CID MolecularFormula MolecularWeight CanonicalSMILES
#> 1 1 C9H17NO4 203.24 CC(=O)OC(CC(=O)[O-])C[N+](C)(C)C
#> 2 2 C9H18NO4+ 204.24 CC(=O)OC(CC(=O)O)C[N+](C)(C)C
#> 3 3 C7H8O4 156.14 C1=CC(C(C(=C1)C(=O)O)O)O
#> 4 4 C3H9NO 75.11 CC(CN)O
#> 5 5 C3H8NO5P 169.07 C(C(=O)COP(=O)(O)O)N
Each row in the table corresponds to a different CID and lists the requested properties, facilitating easy comparison and further analysis.
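As an illustration of such follow-up analysis, the sketch below sorts and filters the table by molecular weight. To stay runnable offline it rebuilds part of the table by hand from the values printed above instead of calling pubChemData; the 170 g/mol cutoff is an arbitrary choice for the example.

```r
# Local copy of the table printed above, so the sketch runs offline.
props_df <- data.frame(
  CID = 1:5,
  MolecularFormula = c("C9H17NO4", "C9H18NO4+", "C7H8O4", "C3H9NO", "C3H8NO5P"),
  MolecularWeight = c(203.24, 204.24, 156.14, 75.11, 169.07),
  stringsAsFactors = FALSE
)

# Coerce defensively in case the weights arrive as character strings.
props_df$MolecularWeight <- as.numeric(props_df$MolecularWeight)

# Sort from lightest to heaviest, then keep compounds under 170 g/mol.
ordered_df <- props_df[order(props_df$MolecularWeight), ]
light_df <- ordered_df[ordered_df$MolecularWeight < 170, ]
print(light_df$CID)
```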
2. By Name: In addition to using direct identifiers, you can refer to a chemical by its common name. This method allows users to search for compounds using familiar names instead of numerical identifiers. It’s important to note that a single name might correspond to multiple records in the PubChem database. For example, the name “glucose” can refer to several different compounds or isomers. Here’s how you can retrieve Compound IDs (CIDs) for “glucose”:
result <- get_pug_rest(identifier = "glucose", namespace = "name", domain = "compound", operation = "cids", output = "TXT")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Compound
#> - Namespace: Name
#> - Operation: cids
#> - Identifier: glucose
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
The pubChemData function is then used to extract the retrieved data, which it returns as a data frame. The output shows that the search for “glucose” returned a single CID:
pubChemData(result)
#> Text
#> 1 5793
This output reveals that the common name “glucose” corresponds to the CID 5793. This CID can then be used in further queries to retrieve detailed information about the compound, such as its molecular structure, properties, and associated bioactivities.
Using a common name for searching can simplify the process, especially when the numerical identifiers are not known. However, because a name can map to multiple records, the results might need further filtering or validation to ensure they correspond to the specific compound of interest.
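Names often contain spaces or other characters that are not valid in a URL path, so a name has to be percent-encoded before it can appear in a PUG REST request. The sketch below shows the encoding step with base R. Whether get_pug_rest performs this encoding internally is not stated here, so treat it as an illustration of the underlying request and not as a required step; encode_name is a hypothetical helper.

```r
# Hypothetical helper: percent-encode a name so it is safe to use as a
# single URL path segment.
encode_name <- function(name) {
  utils::URLencode(name, reserved = TRUE)
}

encoded <- encode_name("acetic acid")
print(encoded)

# Assumed URL layout for a name-to-CID lookup through PUG REST.
url <- paste("https://pubchem.ncbi.nlm.nih.gov/rest/pug",
             "compound", "name", encoded, "cids", "TXT", sep = "/")
print(url)
```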
3. By Structure Identity: Another method to specify a compound in PubChem PUG REST API requests is by using structural identifiers such as SMILES or InChI keys. This approach allows for precise identification of chemical structures by providing a textual representation of the molecule. For example, to retrieve the CID for the SMILES string “CCCC” (which represents butane), you can use the following code:
result <- get_pug_rest(identifier = "CCCC", namespace = "smiles", domain = "compound", operation = "cids", output = "TXT")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Compound
#> - Namespace: SMILES
#> - Operation: cids
#> - Identifier: CCCC
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
When pubChemData(result) is executed, it returns the output as a data frame. The output shows that the search for the SMILES string “CCCC” returned a single CID:
pubChemData(result)
#> Text
#> 1 7843
This output reveals that the SMILES string “CCCC” corresponds to the CID 7843. This CID can then be used in further queries to gather detailed information about the compound, such as its molecular structure, physical and chemical properties, biological activities, and more.
Using structure-based identifiers like SMILES or InChI keys is particularly useful for precise and unambiguous chemical searches, as these identifiers provide a detailed representation of the molecule’s structure. This method ensures that the exact compound of interest is identified, reducing the risk of ambiguity that might arise with common names or other identifiers.
4. By Fast (Synchronous) Structure Search: In PubChem PUG REST API, fast (synchronous) structure search allows for quicker searches by identity, similarity, substructure, and superstructure, often returning results in a single call. This method is efficient for obtaining results quickly and is useful for various types of structural queries.
To illustrate, let’s perform a fast identity search for the compound with CID 5793, using the same connectivity option:
result <- get_pug_rest(identifier = "5793", namespace = "cid", domain = "compound", operation = "cids", output = "TXT", searchtype = "fastidentity", options = list(identity_type = "same_connectivity"))
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Compound
#> - Namespace: CID
#> - Operation: cids
#> - Identifier: 5793
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
When the following code is executed, it returns a data frame listing all CIDs that match the fast identity search criteria.
pubChemData(result)
#> Text
#> 1 5793
#> 2 206
#> 3 6036
#> 4 18950
#> 5 64689
#> 6 66371
#> 7 79025
#> 8 81696
#> 9 104724
#> 10 185698
#> 11 439353
#> 12 439357
#> 13 439507
#> 14 439583
#> 15 439680
#> 16 441032
#> 17 441033
#> 18 441034
#> 19 441035
#> 20 444314
#> 21 448388
#> 22 448702
#> 23 451187
#> 24 451188
#> 25 451189
#> 26 452245
#> 27 455147
#> 28 657055
#> 29 1549080
#> 30 2724488
#> 31 3000450
#> 32 3034742
#> 33 5319264
#> 34 6102790
#> 35 6321330
#> 36 6323336
#> 37 6400264
#> 38 6560213
#> 39 6971003
#> 40 6971007
#> 41 6971016
#> 42 6971096
#> 43 6971097
#> 44 6971098
#> 45 6992021
#> 46 6992084
#> 47 7018164
#> 48 7043897
#> 49 7044038
#> 50 7098663
#> 51 7098664
#> 52 7157007
#> 53 9794056
#> 54 9815418
#> 55 9834129
#> 56 9899007
#> 57 10035228
#> 58 10081060
#> 59 10103794
#> 60 10130220
#> 61 10197954
#> 62 10219674
#> 63 10219763
#> 64 10313382
#> 65 10329946
#> 66 10899282
#> 67 10954241
#> 68 11019447
#> 69 11030410
#> 70 11344362
#> 71 11367383
#> 72 11412863
#> 73 11480819
#> 74 11492034
#> 75 11571906
#> 76 11571917
#> 77 11600783
#> 78 11651921
#> 79 11672764
#> 80 11959770
#> 81 11970126
#> 82 12003287
#> 83 12193653
#> 84 12285853
#> 85 12285856
#> 86 12285861
#> 87 12285862
#> 88 12285863
#> 89 12285866
#> 90 12285868
#> 91 12285869
#> 92 12285870
#> 93 12285871
#> 94 12285873
#> 95 12285877
#> 96 12285878
#> 97 12285879
#> 98 12285885
#> 99 12285886
#> 100 12285889
#> 101 12285890
#> 102 12285891
#> 103 12285892
#> 104 12285893
#> 105 12285894
#> 106 16054987
#> 107 16211884
#> 108 16211941
#> 109 16211984
#> 110 16211986
#> 111 16212959
#> 112 16212960
#> 113 16212966
#> 114 16213546
#> 115 16213640
#> 116 16213872
#> 117 16217112
#> 118 16219580
#> 119 21355827
#> 120 22825318
#> 121 22836365
#> 122 22836366
#> 123 23424086
#> 124 24728695
#> 125 24802149
#> 126 24802163
#> 127 24802281
#> 128 24892722
#> 129 42626680
#> 130 44328781
#> 131 44328785
#> 132 46188479
#> 133 46780441
#> 134 46897877
#> 135 50939543
#> 136 51340651
#> 137 54445181
#> 138 54445182
#> 139 56845432
#> 140 56845995
#> 141 57197748
#> 142 57288387
#> 143 57483528
#> 144 57691826
#> 145 57973135
#> 146 58070804
#> 147 58265153
#> 148 58265160
#> 149 58265166
#> 150 58265178
#> 151 58265190
#> 152 58265196
#> 153 58300638
#> 154 58594768
#> 155 58595959
#> 156 58618581
#> 157 58969552
#> 158 59034276
#> 159 59036328
#> 160 59040622
#> 161 59083882
#> 162 59105109
#> 163 59125088
#> 164 59146659
#> 165 59383280
#> 166 59445439
#> 167 59503407
#> 168 59503411
#> 169 59886072
#> 170 59965103
#> 171 60052896
#> 172 60078648
#> 173 66629908
#> 174 67518639
#> 175 67615000
#> 176 67615455
#> 177 67641738
#> 178 67938791
#> 179 67944215
#> 180 67944290
#> 181 67950444
#> 182 68167579
#> 183 68324677
#> 184 68334110
#> 185 69528681
#> 186 70443535
#> 187 70543261
#> 188 71309028
#> 189 71309128
#> 190 71309129
#> 191 71309140
#> 192 71309397
#> 193 71309503
#> 194 71309513
#> 195 71309514
#> 196 71309671
#> 197 71309852
#> 198 71309905
#> 199 71309908
#> 200 71309927
#> 201 71317094
#> 202 71317095
#> 203 71317096
#> 204 71317097
#> 205 71317182
#> 206 71777654
#> 207 75357255
#> 208 76973265
#> 209 86278404
#> 210 87297824
#> 211 87929779
#> 212 87931119
#> 213 88255060
#> 214 88547603
#> 215 88974141
#> 216 89000581
#> 217 89200515
#> 218 89332529
#> 219 89374440
#> 220 89424182
#> 221 89742272
#> 222 89855666
#> 223 90057933
#> 224 90159939
#> 225 90346255
#> 226 90470917
#> 227 90472751
#> 228 90472752
#> 229 90472753
#> 230 90472761
#> 231 90472762
#> 232 90472770
#> 233 90473076
#> 234 90781811
#> 235 90895196
#> 236 91057721
#> 237 92043367
#> 238 92043446
#> 239 101015849
#> 240 101033892
#> 241 101254308
#> 242 101254309
#> 243 101254310
#> 244 101254311
#> 245 101254312
#> 246 101254313
#> 247 101254314
#> 248 101254315
#> 249 101469918
#> 250 101513786
#> 251 101718250
#> 252 101718251
#> 253 101796201
#> 254 102089288
#> 255 102447462
#> 256 102447463
#> 257 102601142
#> 258 102601177
#> 259 102601371
#> 260 102601743
#> 261 102601816
#> 262 117064633
#> 263 117064644
#> 264 117065485
#> 265 117633116
#> 266 117768413
#> 267 117938207
#> 268 118797420
#> 269 118797610
#> 270 118797621
#> 271 118797622
#> 272 118855887
#> 273 118855889
#> 274 118855904
#> 275 118855910
#> 276 118855920
#> 277 118855925
#> 278 118924468
#> 279 121494046
#> 280 121494058
#> 281 122360911
#> 282 122522140
#> 283 125309563
#> 284 125353406
#> 285 126704391
#> 286 129629038
#> 287 131698424
#> 288 131698425
#> 289 131698450
#> 290 131699179
#> 291 131842051
#> 292 131966764
#> 293 132939819
#> 294 132939820
#> 295 133119158
#> 296 133119249
#> 297 133121364
#> 298 133662560
#> 299 133662561
#> 300 133662562
#> 301 134695353
#> 302 134860471
#> 303 136898365
#> 304 137554722
#> 305 139025182
#> 306 140565377
#> 307 141697522
#> 308 142119910
#> 309 145874935
#> 310 153238571
#> 311 153238579
#> 312 154008122
#> 313 154724068
#> 314 155920671
#> 315 156615593
#> 316 162642814
#> 317 166450901
#> 318 166606713
#> 319 168357285
#> 320 168358828
#> 321 168361185
#> 322 168361364
#> 323 168361934
#> 324 168362043
#> 325 168363258
#> 326 168364204
#> 327 168365395
#> 328 168365574
#> 329 168366670
#> 330 168371870
#> 331 168372103
#> 332 168375045
#> 333 168375837
#> 334 168375921
#> 335 169440996
#> 336 169440997
#> 337 169493886
#> 338 169493887
#> 339 169494552
The output lists every compound that shares the same connectivity as the compound identified by CID 5793. Each row is a CID with the same atom connectivity as the query compound; differences such as stereochemistry are not considered at this identity level. This run returned 339 CIDs from a single synchronous call, which shows how quickly the fast identity search can identify structurally related compounds.
This method is advantageous for researchers needing to identify compounds with similar structures for further study, such as drug development, chemical analysis, or bioactivity screening. The fast response time and comprehensive results make it a valuable tool for various chemical and pharmaceutical applications.
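A result of this size is usually processed in smaller groups when the CIDs are fed into follow-up requests. The sketch below splits a vector of CIDs into batches; batch_ids is a hypothetical helper, the batch size of 100 is an arbitrary choice, and the placeholder vector stands in for the Text column of the data frame shown above.

```r
# Hypothetical helper, not part of PubChemR: splits a vector of IDs
# into consecutive batches of at most `size` elements.
batch_ids <- function(ids, size = 100) {
  groups <- ceiling(seq_along(ids) / size)
  unname(split(ids, groups))
}

# Stand-in for as.integer(pubChemData(result)$Text); 339 placeholder IDs.
cids <- seq_len(339)
batches <- batch_ids(cids, size = 100)
print(lengths(batches))
```

Each element of batches could then be passed as the identifier argument of a separate get_pug_rest call, keeping individual requests small.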
5. By Cross-Reference (XRef): The cross-reference (XRef) method allows for reverse lookup of records using a cross-reference value. This method is particularly useful for linking external identifiers, such as patent numbers, to records in the PubChem database. For example, to find all SIDs linked to a specific patent identifier, you can use the following code:
result <- get_pug_rest(identifier = "US20050159403A1", namespace = "xref/PatentID", domain = "substance", operation = "sids", output = "JSON")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Substance
#> - Namespace: DomainSpecific
#> - Operation: sids
#> - Identifier: US20050159403A1
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
The pubChemData function returns all SIDs linked to the specified patent identifier. The output shows that the patent identifier “US20050159403A1” is linked to a large number of SIDs, each representing a substance record associated with the patent.
pubChemData(result)
#> $IdentifierList
#> $IdentifierList$SID
#> [1] 127377127 127382334 127386382 127387376 127394906 127397376 127401897
#> [8] 127417654 127421022 127427561 127429426 127432264 127440404 127440428
#> [15] 127445646 127449926 127457867 127460434 127463574 127464134 127494894
#> [22] 127496937 127498548 127529998 127543991 127546018 127550965 127560184
#> [29] 127566278 127568614 127579066 127599056 127600815 127611817 127623534
#> [36] 127625503 127632433 127671807 127679107 127688846 127691718 127697988
#> [43] 127711332 127715174 127721611 127722233 127724090 127724974 127727269
#> [50] 127740859 127774458 127786062 127790664 127793921 127794507 127795243
#> [57] 127815192 127824113 127825299 127832880 127835824 127841389 127851147
#> [64] 127855878 127884538 127887083 127907515 127915891 127918344 127926702
#> [71] 127931871 127943403 127949193 127959361 127959362 127961945 127968700
#> [78] 127974435 127988890 128003687 128016428 128024305 128030061 128033308
#> [85] 128033392 128037075 128041901 128043208 128054804 128071903 128075872
#> [92] 128078936 128080424 128086739 128096360 128107971 128108063 128108066
#> [99] 128112623 128116501 128122481 128133149 128139949 128147430 128149266
#> [106] 128164707 128171553 128181006 128199891 128202584 128204869 128208747
#> [113] 128209556 128214152 128216313 128231608 128233622 128235075 128239224
#> [120] 128247683 128253922 128254582 128257870 128265053 128273912 128287030
#> [127] 128289228 128292761 128310374 128310447 128311517 128313782 128319875
#> [134] 128321254 128324845 128327185 128328645 128338633 128344584 128358565
#> [141] 128360063 128361597 128370060 128375029 128381421 128383506 128383593
#> [148] 128385126 128390596 128391357 128391640 128397827 128407104 128414752
#> [155] 128417831 128417970 128421724 128427165 128429151 128431035 128453144
#> [162] 128454426 128459192 128479099 128485819 128490532 128490533 128496172
#> [169] 128504855 128514984 128515903 128531133 128539521 128548161 128550853
#> [176] 128555702 128571383 128573137 128579725 128582246 128587585 128600655
#> [183] 128620771 128624545 128627105 128628584 128632557 128632828 128633726
#> [190] 128633902 128634258 128639692 128642692 128646882 128649366 128675326
#> [197] 128675957 128689052 128690685 128693396 128694828 128699310 128702849
#> [204] 128706476 128716066 128720357 128723298 128723509 128731938 128749024
#> [211] 128749163 128751829 128757815 128766098 128776154 128783768 128789021
#> [218] 128789967 128791547 128796670 128801133 128801849 128804888 128804889
#> [225] 128808454 128815642 128821150 128833107 128837247 128846910 128856633
#> [232] 128862719 128865088 128873754 128874253 128880985 128881190 128894056
#> [239] 128894682 128905743 128908713 128915047 128915536 128921696 128931918
#> [246] 128934857 128935768 128953776 128955396 128970283 128981506 128988380
#> [253] 128997690 129001067 129007663 129014483 129015035 129015817 129016179
#> [260] 129017989 129017990 129026427 129044226 129053129 129054986 129068688
#> [267] 129068801 129079556 129084855 129102578 129104885 129111825 129113828
#> [274] 129115030 129122432 129128697 129130686 129133094 129136176 129140081
#> [281] 129141617 129142896 129148181 129162993 129168398 129173408 129173457
#> [288] 129176810 129177288 129184645 129192608 129202881 129203714 129205760
#> [295] 129209673 129218442 129224837 129228814 129233871 129233873 129233874
#> [302] 129255832 129266266 129272170 129288442 129292073 129294634 129302252
#> [309] 129309716 129310311 129313799 129317146 129328896 129333065 129337106
#> [316] 129337589 129345232 129349967 129351897 129359174 129365621 129366270
#> [323] 129370686 129371805 129374061 129376455 129378267 129379472 129383430
#> [330] 129384458 129389744 129403822 129421538 129440903 129455218 129456175
#> [337] 129457573 129462203 129479103 129489229 129493774 129504746 129508593
#> [344] 129517263 129521307 129526420 129534168 129538802 129547789 129551052
#> [351] 129551258 129555666 129562150 129573817 129586364 129598381 129614967
#> [358] 129617515 129624957 129625178 129625510 129627772 129627879 129629301
#> [365] 129632951 129670298 129672175 129685831 129688258 129702886 129706999
#> [372] 129709395 129711942 129715311 129726350 129728891 129729719 129736495
#> [379] 129738647 129764614 129770055 129776398 129781417 129797152 129808214
#> [386] 129809046 129809280 129811709 129815623 129815624 129818107 129838999
#> [393] 129842321 129844561 129845709 129855419 129865464 129871639 129873187
#> [400] 129873949 129879637 129880907 129894085 129894756 129894757 129906668
#> [407] 129908561 129912020 129921074 129935955 129936330 129941294 129945069
#> [414] 129950653 129962601 129975648 134359275 134393961 134412235 135747269
#> [421] 135752889 135754219 135763149 135798297 135802614 135806152 135826617
#> [428] 135837077 135842328 135847369 135857922 135878265 135887484 135892282
#> [435] 135900333 135900801 135946050 135971530 136022349 136022812 136044641
#> [442] 136052383 136057158 136072993 136080169 136099624 136116271 136118214
#> [449] 136126969 136156103 136157629 136161084 136172216 136222651 136227638
#> [456] 136234395 136261580 136265581 136269831 136272977 136273523 136275074
#> [463] 136291195 136291391 136302881 136314053 136324561 136326756 137349406
#> [470] 137349415 139042442 139667196 140037191 140037194 140254002 140254003
#> [477] 140254004 140254005 140254006 140254007 140254008 140254009 140254010
#> [484] 140254011 140254012 140254013 140254014 140254015 140254016 140254017
#> [491] 140254018 140254019 140254020 140254021 140254022 140254023 140254024
#> [498] 140254025 140254026 140254027 140254028 140254029 140254030 140254031
#> [505] 140254032 140254033 140254034 140254035 140254036 140254037 140254038
#> [512] 140254039 140254040 140254041 140254042 140254043 140254044 140254045
#> [519] 140254046 140254047 140254048 140254049 140254050 140254051 140254052
#> [526] 140254053 140254054 140254055 140254056 140254057 140254058 140254059
#> [533] 140254060 140254061 140254062 140254063 140254064 140254065 140254066
#> [540] 140254067 140254068 140254069 140254070 140254071 140254072 140254073
#> [547] 140254074 140254075 140254076 140254077 140254078 140254079 140254080
#> [554] 140254081 140254082 140254083 140254084 140254085 140254086 140254087
#> [561] 140254088 140254089 140254090 140254091 140254092 140254093 140254094
#> [568] 140254095 140254096 140254097 140254098 140254099 140254100 140262425
#> [575] 140262449 140262456 140262460 140262471 140262478 140262480 140262481
#> [582] 140262486 140262491 140262504 140297441 140297443 140343485 140343488
#> [589] 140343490 140343491 140343493 140343497 140343499 140343505 140492225
#> [596] 141972099 141972100 141989468 142047558 142047559 142059950 142086609
#> [603] 142090637 142093192 142106350 142106351 142106355 142106365 142106373
#> [610] 142130212 142216111 142266779 142289307 142290035 142292362 142292867
#> [617] 142300477 142301161 142355996 142359434 142359441 142365242 142395042
#> [624] 142395064 142396928 142396930 142396931 142398573 142405238 142418631
#> [631] 142438154 142448656 142456985 142456989 142456990 142456991 142456995
#> [638] 142457001 142457025 142457027 142457031 142457036 142457040 142457045
#> [645] 142457053 142457055 142457058 142457059 142457063 142457067 142457072
#> [652] 142457075 142457076 142457077 142457082 142457086 142457092 142457095
#> [659] 142457098 142457099 142457106 142457115 142457119 142457130 142457134
#> [666] 142457138 142457141 142457144 142457146 142457150 142457151 142457164
#> [673] 142457170 142457176 142457178 142457181 142457183 142457188 142457195
#> [680] 142457199 142457201 142457202 142457210 142457212 142457213 142457217
#> [687] 142472709 142504653 142517754 142718650 142794296 142794297 142794298
#> [694] 142976674 142976694 143004316 143078051 143078053 143078054 143078056
#> [701] 143078060 143078078 143078326 143091452 143251864 143251865 143251867
#> [708] 143251868 143251869 143251871 143251875 143251881 143251883 143251885
#> [715] 143251886 143251887 143251888 143251890 143251892 143251894 143251895
#> [722] 143274528 143298830 143319100 226393239 226393293 226393570 226393684
#> [729] 226393819 226393829 226393830 226393853 226393858 226393970 226393971
#> [736] 226394040 226394128 226394161 226394178 226394363 226394438 226394481
#> [743] 226394721 226394839 226395300 226395327 226395471 226395557 226395600
#> [750] 226395667 226395683 226395726 226395777 226395791 226395832 226395919
#> [757] 226395920 226395976 226395977 226396049 226396167 226396198 226396401
#> [764] 226396434 226396445 226396464 226396553 226396598 226396617 226396621
#> [771] 226396627 226397203 226397214 226397352 226397529 226397954 226397955
#> [778] 226398150 226398151 226398302 226398454 226399315 226399316 226399355
#> [785] 226399517 226399539 226399589 226400026 226400044 226400188 226400189
#> [792] 226400247 226400424 226401770 226402238 226405416 226405489 226405836
#> [799] 226406284 226406322 226406323 226406337 226406377 226406378 226406714
#> [806] 226406715 226406888 226407848 226408395 226408853 226409030 226409597
#> [813] 226409697 226410719 226411068 226411348 226411370 226411403 226411806
#> [820] 226412544 226412844 226412845 226412937 226413104 226413385 226413386
#> [827] 226413604 226413605 226413606 226414206 226414262 226414263 226414297
#> [834] 226414324 226419993 226419994 226420042 226420160 226420161 226420292
#> [841] 226420293 226420456 226420457 226420566 226420567 226420640 226420681
#> [848] 226420720 226420751 226420773 226420774 226420775 226420787 226420804
#> [855] 226420877 226421001 226421002 226421003 226421699 226421728 226423326
#> [862] 226423740 226423826 226424221 226424268 226424269 226424411 226424412
#> [869] 226424915 226425111 226425112 226425266 226425785 226425786 226425915
#> [876] 226425916 226426085 226426288 226426344 226426345 226426582 226426623
#> [883] 226426904 226427860 226427861 226428721 226428722 226428885 226429123
#> [890] 226429124 226432865 226432866 226433067 226433069 226433070 226433090
#> [897] 226433091 226433125 226433126 226433141 226433192 226433193 226433202
#> [904] 226433203 226433344 226433345 226433523 226433524 226433876 226434160
#> [911] 226434771 226440330 226441562 226443823 226445144 226457181 226457266
#> [918] 226457330 226457331 226457556 226458114 226458225 226458539 226459695
#> [925] 226461214 226462543 226471608 226471621 226472159 226472161 226487293
#> [932] 226487863 226487864 226489321 226490310 226490791 226492265 226492375
#> [939] 226492435 226492437 226492438 226493640 226494285 226494502 226495017
#> [946] 226495025 226497342 226503391 226503607 226506442 226506563 226516040
#> [953] 226520272 226521124 226542800 226553920 226554136 226564914 226567033
#> [960] 226567034 226567043 226567109 226567110 226567180 226567194 226567379
#> [967] 226567380 226567400 226567435 226567443 226567450 226567458 226567459
#> [974] 226567556 226567567 226567574 226567579 226567629 226567646 226567656
#> [981] 226567670 226567681 226567699 226567701 226567702 226567706 226567717
#> [988] 226567727 226567759 226567799 226567800 226567820 226567824 226567833
#> [995] 226567848 226567862 226567886 226567894 226567923 226567935 226567950
#> [1002] 226567961 226567989 226568036 226568060 226568061 226568066 226568082
#> [1009] 226568123 226568127 226568139 226568146 226568177 226568234 226568241
#> [1016] 226568245 226568264 226568294 226568315 226568346 226568349 226568359
#> [1023] 226568362 226568364 226568408 226568416 226568417 226568432 226568449
#> [1030] 226568457 226568460 226568463 226568464 226568465 226568530 226568537
#> [1037] 226568633 226568641 226568645 226568657 226568663 226568670 226568680
#> [1044] 226568696 226568699 226568700 226568701 226568711 226568713 226568725
#> [1051] 226568734 226568738 226568746 226568772 226568798 226568876 226568911
#> [1058] 226568928 226568929 226568941 226568980 226569025 226569064 226569099
#> [1065] 226569118 226569177 226569196 226569204 226569217 226569220 226569226
#> [1072] 226569232 226569328 226569330 226569349 226569414 226569437 226569451
#> [1079] 226569464 226569471 226569516 226569535 226569554 226569573 226569593
#> [1086] 226569594 226569596 226569607 226569623 226569624 226569625 226569648
#> [1093] 226569680 226569683 226569748 226569754 226569769 226569783 226569821
#> [1100] 226569829 226569832 226569836 226569850 226569859 226569892 226569907
#> [1107] 226569908 226569912 226569959 226569976 226569984 226569988 226570060
#> [1114] 226570099 226570146 226570279 226570308 226570330 226570331 226570363
#> [1121] 226570382 226570383 226570483 226570745 226570749 226570755 226570849
#> [1128] 226571011 226573101 226573345 226574052 226574439 226582522 226584018
#> [1135] 226589272 226589273 226597434 226597505 226600715 226600746 226647066
#> [1142] 226648068 226656069 226667895 226668614 226668738 226668855 226669423
#> [1149] 226669639 226670087 226683506 226696380 226710719 226823612 226871249
#> [1156] 226876459 226876460 226936194 226965286 226972501 226973045 226973164
#> [1163] 227122429 227236737 227281952 227325851 227339001 227384742 227409508
#> [1170] 227461168 227461872 227474526 227511968 227511969 227564952 227748570
#> [1177] 227960305 227987782 227988458 227990171 227990458 227991140 228067840
#> [1184] 228093688 228094302 228094400 228094598 228099363 228140626 228159764
#> [1191] 228159766 228248383 228248928 228263743 228265260 228304181 228304246
#> [1198] 228355922 228361147 228452780 228639103 228770559 228770560 228819438
#> [1205] 228821051 228821054 228821055 228821056 228821057 228821058 228821059
#> [1212] 229280318 229294930 229334353 229337219 229634546 230157026 230160664
#> [1219] 230162107 230331623 230331934 230335306 230336805 230336809 230337506
#> [1226] 230337861 230338660 230338865 230922657 231379826 231380229 231380490
#> [1233] 231380595 231408598 231804558 231804593 231804765 231805001 231805228
#> [1240] 231805267 231805479 231805547 231805598 231805601 231805607 232099662
#> [1247] 232168843 242681556 242717650 242881388 242920991 242939316 243086546
#> [1254] 243093644 243247276 243279817 243310463 243530634 243738751 243781361
#> [1261] 243836721 244274765 244304819 244351778 244473144 244482943 244615862
#> [1268] 244627129 244636869 244759771 244861749 245029031 245033760 245283852
#> [1275] 245309404 245363541 245370020 245708899 245712680 245833937 245910951
#> [1282] 246055152 246059618 246152086 246185335 246302557 246328178 246352373
#> [1289] 246554028 246643596 246771767 246837580 246918682 246999746 247221074
#> [1296] 247555200 247576141 247608871 247940116 248042834 248050678 248370740
#> [1303] 248382919 248412763 248454658 248521494 248564737 248603625 248633503
#> [1310] 248880456 248936405 248962580 249096157 406908050 406924010 406974008
#> [1317] 407050344 407246776 407400630 407406725 407454637 407458795 407597864
#> [1324] 407694713 408147735 408215224 408217559 408259874 408364713 408470213
#> [1331] 408624647 408901036 409227565 409282571 409314082 409433585 409471800
#> [1338] 409673423 409688950 409807984 409808036 409844257 409847976 409898992
#> [1345] 409899538 409899617 409900700 409918668 409918859 409956750 410076269
#> [1352] 410111933 410112156 410199016 410250219 410310907 410320935 410358412
#> [1359] 410358870 410358993 410364139 410368205 410390548 410443241 410611603
#> [1366] 410656308 410714557 410730470 410800662 410813610 410878039 410889153
#> [1373] 411029050 411120346 411209812 411236714 411322314 411481111 411739633
#> [1380] 411848723 412068746 412190947 412223907 412476987 412478856 412516310
#> [1387] 412540966 412700950 412759940 412769395 412771380 412892589 413247619
#> [1394] 413431688 413450070 413450927 413457356 413457635 413458038 413458045
#> [1401] 413478250 413485032 413490608 413490982 413505830 413505848 413505864
#> [1408] 413505881 413515875 413516940 413517191 413517594 413517711 413517787
#> [1415] 413518248 413518286 413518452 413555803 413573934 413576342 413577607
#> [1422] 413578969 413583708 413584099 413596608 413598583 413598590 413647181
#> [1429] 413911986 413963852 414064183 414626516 414661397 414825813 414856510
#> [1436] 414892091 414935619 414945852 414956160 415069749 415110648 415136092
#> [1443] 415714536 415727884 415728598 415729189 415729254 415729316 415732559
#> [1450] 415732675 415732908 415732945 415732955 415732961 415732979 415733001
#> [1457] 415733004 415740875 415748816 415748861 415748865 415748910 415749238
#> [1464] 415749262 415749353 415750692 415812290 415816885 415821932 415824540
#> [1471] 415827755 415827762 415828559 415829165 415830681 415835565 415850432
#> [1478] 415850705 415851694 415853441 415855021 415857010 415859316 415871129
#> [1485] 415872890 415873239 415875742 415939267 415966755 415966911 415967507
#> [1492] 416129933 416132545 419476058 419476730 419479763 419480873 419480972
#> [1499] 419485881 419486666 419490957 419491152 419491791 419492671 419492675
#> [1506] 419492924 419493503 419494003 419494133 419494704 419495016 419495902
#> [1513] 419495947 419496295 419497698 419497700 419497832 419499306 419499403
#> [1520] 419499559 419499585 419499632 419499643 419499837 419499846 419499949
#> [1527] 419500086 419500181 419500734 419500914 419501182 419501547 419501734
#> [1534] 419502131 419502930 419503347 419503687 419503883 419503967 419504952
#> [1541] 419505428 419506158 419506282 419506849 419507096 419507450 419507595
#> [1548] 419510089 419512824 419513193 419513201 419513968 419514082 419514092
#> [1555] 419514098 419514102 419514108 419514115 419514117 419516874 419523755
#> [1562] 419524130 419524875 419525381 419526621 419527521 419531719 419531724
#> [1569] 419531750 419531947 419532553 419532839 419534378 419534410 419534469
#> [1576] 419534484 419535046 419535071 419535113 419536034 419536121 419546112
#> [1583] 419549332 419551105 419551195 419551405 419551879 419551899 419551910
#> [1590] 419552498 419553075 419553096 419553297 419554682 419555018 419555613
#> [1597] 419555769 419557728 419559147 419559194 419559505 419561476 419572553
#> [1604] 419574667 419574686 419578517 419585247 419590036 419590658 419590946
#> [1611] 419591129 419591195 419591612 419592226 419593794 419595719 419681507
#> [1618] 419682519 419690476 419700929 419701108 419705795 419714072 419714186
#> [1625] 419721779 419722600 419724974 419742001 419765198 419765456 419968009
#> [1632] 420012827 420028738 420028752 420028796 420028825 420121099 420132130
#> [1639] 420290406 420482162 420509561 420599383 420599583 420611634 420613278
#> [1646] 420625825 420634933 420662981 420668293 420713373 420762856 420764388
#> [1653] 421051216 421054472 421152487 421168189 421185220 421200485 421285150
#> [1660] 421467012 421467021 421467167 421467814 421467821 421467841 421468181
#> [1667] 421618729 421645675 422224723 422286156 422291469 422291508 422291541
#> [1674] 422291868 422294992 422295060 422302674 422302967 422308315 422308547
#> [1681] 422309266 422309809 422319370 422361298 422394972 422395317 422395612
#> [1688] 422397162 422397173 422397290 422423005 422425739 422425747 422425769
#> [1695] 422478595 422511204 422511386 422536544 422536551 422537057 422537069
#> [1702] 422537094 422537110 422537118 422550741 422568394 422568403 422571327
#> [1709] 422571343 422571354 422571402 422573560 422573909 423390419 424119783
#> [1716] 424296317 424306124 424306736 424309010 424309022 424757743 424766567
#> [1723] 424766737 424803005 424818908 424819603 424819823 424819832 424849043
#> [1730] 424877080 425113081 425114186 425165937 425171162 425193155 425312104
#> [1737] 425396324 425516770 425535208 425897534 425961463 425965903 425967206
#> [1744] 426045728 426078961 426089733 426103724 426106692 426108038 426109622
#> [1751] 426110449 426119983 426121923 426147338 426149755 426150549 426150787
#> [1758] 426157394 426159592 426272317 426288672 426294028 426310077 426324267
#> [1765] 426359948 426412333 426432571 426442188 426459973 426517444 427700056
#> [1772] 431577953 432891950 433323349 433323597 433567908 447595048 447853768
#> [1779] 447923060 448050232 448532098 448892836 449326453 449676699 449681106
#> [1786] 449910701 450051033 450232513 450235101 450485725 450532993 450630545
#> [1793] 450682277 451013988 451180599 451555494 451961792 452145594 452311423
#> [1800] 452516785 452986205 453242316 453516563 453664627 453678106 453867505
#> [1807] 453889148 454241035 454274907 454483135 454611257 454631768 456171974
#> [1814] 456295911 456370357 456371022 456411915 456468049 456636612 456671281
#> [1821] 456739466 456796473 456977526 457169682 457192620 457203572 457280508
#> [1828] 457282502 457310819 457311044 457459970 457460012 457584123 457655803
#> [1835] 457673794 457673796 457815268 457842244 458157987 458347242 458392451
#> [1842] 458393276 458393605 458393636 458393652 458393705 458393708 458393785
#> [1849] 458393799 458393801 458394569 458394830 458394859 458394861 458395827
#> [1856] 458396401 458396411 458396423 458397310 458397365 458427722 466349722
#> [1863] 482532917
This result lists the SIDs of all substances associated with the specified patent (1,863 in the run shown above). These identifiers can be used as input for further requests against the PubChem database.
Once you’ve specified the records of interest in PUG REST, the next step is to define what information you want to retrieve about them. PUG REST can return specific data points for each record, such as individual properties or cross-references, so there is no need to download and sift through large datasets.
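A PUG REST request is ultimately a URL made of an input part (domain, namespace, identifiers), an optional operation, and an output format. The sketch below assembles such a URL in base R to make that structure visible. The helper `build_pug_rest_url` is hypothetical and is not part of PubChemR; it only illustrates the public URL pattern and makes no claim about how `get_pug_rest` builds its requests internally.

```r
# Hypothetical helper (not part of PubChemR): assemble a PUG REST URL
# from its input, operation, and output parts.
build_pug_rest_url <- function(domain, namespace, identifier,
                               operation = NULL, output = "JSON") {
  base <- "https://pubchem.ncbi.nlm.nih.gov/rest/pug"
  # Multiple identifiers are sent as one comma-separated path segment.
  ids <- paste(identifier, collapse = ",")
  # c() drops a NULL operation, so the optional part needs no special case.
  parts <- c(base, domain, namespace, ids, operation, output)
  paste(parts, collapse = "/")
}

# Full record for CID 2244 in SDF format
build_pug_rest_url("compound", "cid", "2244", output = "SDF")

# Cross-references: the operation has two path segments
build_pug_rest_url("compound", "cid", "2244",
                   operation = c("xrefs", "MMDBID"))
```

Chemical names containing spaces, or SMILES strings, would additionally need URL encoding, which this sketch leaves out.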
1. Full Records: PUG REST can return entire records in formats such as JSON, XML, ASN.1, and SDF (CSV and TXT are intended for tabular and single-value results rather than full compound records). For example, to retrieve the record for aspirin (CID 2244) in SDF format:
result <- get_pug_rest(identifier = "2244", namespace = "cid", domain = "compound", output = "SDF")
Multiple records can also be requested in a single call, though large lists may be subject to timeouts.
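One way to reduce the risk of timeouts with long identifier lists is to split them into smaller batches and request each batch separately. The following is a minimal base R sketch; the helper `split_into_batches` and the batch size of 100 are illustrative assumptions, not part of PubChemR or a limit documented here.

```r
# Hypothetical helper (not part of PubChemR): split identifiers into
# consecutive batches of at most `batch_size` elements.
split_into_batches <- function(ids, batch_size = 100) {
  stopifnot(batch_size >= 1)
  if (length(ids) == 0) return(list())
  # Elements 1..batch_size get group 1, the next batch_size get group 2, ...
  group <- ceiling(seq_along(ids) / batch_size)
  unname(split(ids, group))
}

cids <- as.character(1:250)
batches <- split_into_batches(cids, batch_size = 100)
lengths(batches)
#> [1] 100 100  50

# Each batch could then be requested on its own, for example:
# results <- lapply(batches, function(b)
#   get_pug_rest(identifier = b, namespace = "cid", domain = "compound",
#                property = "MolecularWeight", output = "CSV"))
```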
2. Images: Images of chemical structures can be retrieved by setting output = "PNG". This works with various input methods, including chemical names, SMILES strings, and InChIKeys. For example, to get an image for the chemical name “lipitor”:
get_pug_rest(identifier = "lipitor", namespace = "name", domain = "compound", output = "PNG")
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Compound
#> - Namespace: Name
#> - Operation: <NULL>
#> - Identifier: lipitor
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
3. Compound Properties: Pre-computed properties for PubChem compounds are accessible individually or in tables. For instance, to get the molecular weight of aspirin (CID 2244) as plain text:
result <- get_pug_rest(identifier = "2244", namespace = "cid", domain = "compound", property = "MolecularWeight", output = "TXT")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Compound
#> - Namespace: CID
#> - Operation: <NULL>
#> - Identifier: 2244
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
pubChemData(result)
#> Text
#> 1 180.16
Or to retrieve a CSV table of multiple compounds and properties:
result <- get_pug_rest(identifier = c("1","2","3","4","5"), namespace = "cid", domain = "compound", property = c("MolecularWeight", "MolecularFormula", "HBondDonorCount", "HBondAcceptorCount", "InChIKey", "InChI"), output = "CSV")
pubChemData(result)
#> CID MolecularWeight MolecularFormula HBondDonorCount HBondAcceptorCount
#> 1 1 203.24 C9H17NO4 0 4
#> 2 2 204.24 C9H18NO4+ 1 4
#> 3 3 156.14 C7H8O4 3 4
#> 4 4 75.11 C3H9NO 2 2
#> 5 5 169.07 C3H8NO5P 3 6
#> InChIKey
#> 1 RDHQFKQIGNGIED-UHFFFAOYSA-N
#> 2 RDHQFKQIGNGIED-UHFFFAOYSA-O
#> 3 INCSWYKICIYAHB-UHFFFAOYSA-N
#> 4 HXKKHQJGJAFBHI-UHFFFAOYSA-N
#> 5 HIQNVODXENYOFK-UHFFFAOYSA-N
#> InChI
#> 1 InChI=1S/C9H17NO4/c1-7(11)14-8(5-9(12)13)6-10(2,3)4/h8H,5-6H2,1-4H3
#> 2 InChI=1S/C9H17NO4/c1-7(11)14-8(5-9(12)13)6-10(2,3)4/h8H,5-6H2,1-4H3/p+1
#> 3 InChI=1S/C7H8O4/c8-5-3-1-2-4(6(5)9)7(10)11/h1-3,5-6,8-9H,(H,10,11)
#> 4 InChI=1S/C3H9NO/c1-3(5)2-4/h3,5H,2,4H2,1H3
#> 5 InChI=1S/C3H8NO5P/c4-1-3(5)2-9-10(6,7)8/h1-2,4H2,(H2,6,7,8)
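Because the property table is an ordinary data frame, it can be filtered and sorted with base R. The sketch below rebuilds part of the table from the values printed above instead of calling PubChem, so it runs offline. It assumes the numeric columns are already numeric; if `pubChemData()` returns them as character, an `as.numeric()` conversion would be needed first.

```r
# Values copied from the CSV output above; in practice this data frame
# would come from pubChemData(result).
props <- data.frame(
  CID = 1:5,
  MolecularWeight = c(203.24, 204.24, 156.14, 75.11, 169.07),
  MolecularFormula = c("C9H17NO4", "C9H18NO4+", "C7H8O4",
                       "C3H9NO", "C3H8NO5P"),
  HBondDonorCount = c(0, 1, 3, 2, 3),
  HBondAcceptorCount = c(4, 4, 4, 2, 6),
  stringsAsFactors = FALSE
)

# Keep compounds lighter than 200 g/mol and sort them lightest first.
light <- props[props$MolecularWeight < 200, ]
light <- light[order(light$MolecularWeight), ]
light$CID
#> [1] 4 3 5
```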
4. Synonyms: To list all synonyms recorded for a compound, use the synonyms operation. For example, for Vioxx:
result <- get_pug_rest(identifier = "vioxx", namespace = "name", domain = "compound", operation = "synonyms", output = "JSON")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Compound
#> - Namespace: Name
#> - Operation: synonyms
#> - Identifier: vioxx
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
pubChemData(result)
#> $InformationList
#> $InformationList$Information
#> $InformationList$Information[[1]]
#> $InformationList$Information[[1]]$CID
#> [1] 5090
#>
#> $InformationList$Information[[1]]$Synonym
#> [1] "rofecoxib"
#> [2] "162011-90-7"
#> [3] "Vioxx"
#> [4] "Ceoxx"
#> [5] "MK 966"
#> [6] "4-(4-(Methylsulfonyl)phenyl)-3-phenylfuran-2(5H)-one"
#> [7] "refecoxib"
#> [8] "Vioxx Dolor"
#> [9] "MK-966"
#> [10] "4-[4-(methylsulfonyl)phenyl]-3-phenylfuran-2(5H)-one"
#> [11] "MK0966"
#> [12] "4-[4-(methylsulfonyl)phenyl]-3-phenyl-2(5H)-furanone"
#> [13] "MK 0966"
#> [14] "MK-0966"
#> [15] "rofecoxibum"
#> [16] "CCRIS 8967"
#> [17] "HSDB 7262"
#> [18] "TRM-201"
#> [19] "UNII-0QTW8Z7MCR"
#> [20] "0QTW8Z7MCR"
#> [21] "3-(4-methylsulfonylphenyl)-4-phenyl-2H-furan-5-one"
#> [22] "2(5H)-Furanone, 4-[4-(methylsulfonyl)phenyl]-3-phenyl-"
#> [23] "NSC-720256"
#> [24] "NSC-758705"
#> [25] "CHEBI:8887"
#> [26] "DTXSID2023567"
#> [27] "M01AH02"
#> [28] "MK966"
#> [29] "3-phenyl-4-[4-(methylsulfonyl)phenyl]-2(5H)-furanone"
#> [30] "4-(4-(Methylsulfonyl)phenyl)-3-phenyl-2(5H)-furanone"
#> [31] "4-(p-(Methylsulfonyl)phenyl)-3-phenyl-2(5H)-furanone"
#> [32] "3-Phenyl-4-(4-(methylsulfonyl)phenyl))-2(5H)-furanone"
#> [33] "4-(4-methanesulfonylphenyl)-3-phenyl-2,5-dihydrofuran-2-one"
#> [34] "CHEMBL122"
#> [35] "4-(4-methylsulfonylphenyl)-3-phenyl-5H-furan-2-one"
#> [36] "DTXCID903567"
#> [37] "TRM201"
#> [38] "NSC720256"
#> [39] "NSC 720256"
#> [40] "NSC 758705"
#> [41] "2(5H)-Furanone, 4-(4-(methylsulfonyl)phenyl)-3-phenyl-"
#> [42] "NCGC00095118-01"
#> [43] "ROFECOXIB (MART.)"
#> [44] "ROFECOXIB [MART.]"
#> [45] "Vioxx (trademark)"
#> [46] "SMR000466331"
#> [47] "Vioxx (TN)"
#> [48] "SR-01000762904"
#> [49] "3-phenyl-4-(4-(methylsulfonyl)phenyl)-2(5H)-furanone"
#> [50] "Rofecoxib (JAN/USAN/INN)"
#> [51] "Rofecoxib [USAN:INN:BAN]"
#> [52] "Rofecoxib (Vioxx)"
#> [53] "Rofecoxib [USAN]"
#> [54] "MFCD00935806"
#> [55] "KS-1107"
#> [56] "MK 0996"
#> [57] "Spectrum_000119"
#> [58] "ROFECOXIB [INN]"
#> [59] "ROFECOXIB [JAN]"
#> [60] "SpecPlus_000669"
#> [61] "ROFECOXIB [MI]"
#> [62] "ROFECOXIB [HSDB]"
#> [63] "Spectrum2_000446"
#> [64] "Spectrum3_001153"
#> [65] "Spectrum4_000631"
#> [66] "Spectrum5_001598"
#> [67] "ROFECOXIB [VANDF]"
#> [68] "ROFECOXIB [WHO-DD]"
#> [69] "SCHEMBL3050"
#> [70] "BSPBio_002705"
#> [71] "KBioGR_001242"
#> [72] "KBioGR_002345"
#> [73] "KBioSS_000559"
#> [74] "KBioSS_002348"
#> [75] "MLS000759440"
#> [76] "MLS001165770"
#> [77] "MLS001195623"
#> [78] "MLS001424113"
#> [79] "MLS006010091"
#> [80] "BIDD:GT0399"
#> [81] "DivK1c_006765"
#> [82] "SPECTRUM1504235"
#> [83] "SPBio_000492"
#> [84] "3-(4-methanesulfonylphenyl)-2-phenyl-2-buten-4-olide"
#> [85] "GTPL2893"
#> [86] "ROFECOXIB [ORANGE BOOK]"
#> [87] "BDBM22369"
#> [88] "KBio1_001709"
#> [89] "KBio2_000559"
#> [90] "KBio2_002345"
#> [91] "KBio2_003127"
#> [92] "KBio2_004913"
#> [93] "KBio2_005695"
#> [94] "KBio2_007481"
#> [95] "KBio3_002205"
#> [96] "KBio3_002825"
#> [97] "EX-A708"
#> [98] "cMAP_000024"
#> [99] "HMS1922H11"
#> [100] "HMS2051G16"
#> [101] "HMS2089H20"
#> [102] "HMS2093E04"
#> [103] "HMS2232G21"
#> [104] "HMS3371P11"
#> [105] "HMS3393G16"
#> [106] "HMS3651F16"
#> [107] "HMS3713B07"
#> [108] "HMS3750I17"
#> [109] "HMS3885E05"
#> [110] "Pharmakon1600-01504235"
#> [111] "BCP03619"
#> [112] "Tox21_111430"
#> [113] "CCG-40253"
#> [114] "NSC758705"
#> [115] "s3043"
#> [116] "STK635144"
#> [117] "AKOS000280931"
#> [118] "AB07701"
#> [119] "CS-0997"
#> [120] "DB00533"
#> [121] "NC00132"
#> [122] "SB19518"
#> [123] "NCGC00095118-02"
#> [124] "NCGC00095118-03"
#> [125] "NCGC00095118-04"
#> [126] "NCGC00095118-05"
#> [127] "NCGC00095118-08"
#> [128] "NCGC00095118-17"
#> [129] "NCGC00095118-18"
#> [130] "AC-28318"
#> [131] "BR164362"
#> [132] "HY-17372"
#> [133] "NCI60_041175"
#> [134] "SBI-0206774.P001"
#> [135] "CAS-162011-90-7"
#> [136] "NS00003940"
#> [137] "R0206"
#> [138] "SW219668-1"
#> [139] "C07590"
#> [140] "D00568"
#> [141] "AB00052090-06"
#> [142] "AB00052090-08"
#> [143] "AB00052090_09"
#> [144] "AB00052090_10"
#> [145] "EN300-7364304"
#> [146] "L000912"
#> [147] "Q411412"
#> [148] "Q-201676"
#> [149] "SR-01000762904-3"
#> [150] "SR-01000762904-5"
#> [151] "BRD-K21733600-001-02-6"
#> [152] "BRD-K21733600-001-06-7"
#> [153] "BRD-K21733600-001-14-1"
#> [154] "BRD-K21733600-001-15-8"
#> [155] "BRD-K21733600-001-19-0"
#> [156] "BRD-K21733600-001-23-2"
#> [157] "BRD-K21733600-001-24-0"
#> [158] "3-(4-methanesulfonyl-phenyl)-2-phenyl-2-buten-4-olide"
#> [159] "4-(4'-(Methylsulfonyl)phenyl)-3-phenyl-2(5H)-furanone"
#> [160] "Z2037279770"
#> [161] "2(5H)-Furanone, 4-[4-(methyl-sulfonyl)phenyl]-3-phenyl-"
#> [162] "3-(Phenyl)-4-(4-(methylsulfonyl)phenyl)-2-(5H)-furanone"
#> [163] "3-Phenyl-4-(4-(Methylsulfonyl)Phenyl)-2-(5H)-Furanone"
#> [164] "4-(4-METHANESULFONYL-PHENYL)-3-PHENYL-5H-FURAN-2-ONE"
#> [165] "4-(4-methylsulfonylphenyl)-3-phenyl-2,5-dihydro-2-furanone"
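The synonyms sit a few levels deep in the returned list, so it helps to pull them out into a plain character vector before searching them. The sketch below works on a small mock of the structure printed above (truncated to six synonyms) and picks out entries that look like CAS Registry Numbers. The helper `get_synonyms` is hypothetical, and it assumes the first `Information` element is the record of interest.

```r
# Mock of the nested list shown above, truncated to a few synonyms.
syn_data <- list(InformationList = list(Information = list(
  list(CID = 5090,
       Synonym = c("rofecoxib", "162011-90-7", "Vioxx", "MK-966",
                   "CHEMBL122", "CAS-162011-90-7"))
)))

# Hypothetical helper: return the synonym vector of the first record.
get_synonyms <- function(x) {
  x$InformationList$Information[[1]]$Synonym
}

synonyms <- get_synonyms(syn_data)

# CAS Registry Numbers have the form <2-7 digits>-<2 digits>-<1 digit>.
cas_like <- grep("^[0-9]{2,7}-[0-9]{2}-[0-9]$", synonyms, value = TRUE)
cas_like
#> [1] "162011-90-7"
```

The anchored pattern deliberately skips entries such as "CAS-162011-90-7", which embed the number inside a longer label.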
5. Cross-References (XRefs): PUG REST provides access to various cross-references. The operation is given as a two-element vector: "xrefs" followed by the cross-reference type. For example, to retrieve MMDB identifiers for protein structures containing aspirin:
result <- get_pug_rest(identifier = "2244", namespace = "cid", domain = "compound", operation = c("xrefs","MMDBID"), output = "JSON")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Compound
#> - Namespace: CID
#> - Operation: xrefs
#> - Identifier: 2244
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
pubChemData(result)
#> $InformationList
#> $InformationList$Information
#> $InformationList$Information[[1]]
#> $InformationList$Information[[1]]$CID
#> [1] 2244
#>
#> $InformationList$Information[[1]]$MMDBID
#> [1] 115960 173465 230639 27242 27954 54234 70578 75951
Or to find all patent identifiers associated with a given SID:
result <- get_pug_rest(identifier = "137349406", namespace = "sid", domain = "substance", operation = c("xrefs","PatentID"), output = "TXT")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Substance
#> - Namespace: SID
#> - Operation: xrefs
#> - Identifier: 137349406
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
pubChemData(result)
#> Text
#> 1 US20030181500A1
#> 2 US20050059720A1
#> 3 US20050124633A1
#> 4 US20050124634A1
#> 5 US20050159403A1
#> 6 US20060008862A1
#> 7 US20060135506A1
#> 8 US20070142442A1
#> 9 US20070179152A1
#> 10 US20070196421A1
#> 11 US20070197957A1
#> 12 US20070198063A1
#> 13 US20070208134A1
#> 14 US20070299043A1
#> 15 US20080125595A1
#> 16 US20090075960A1
#> 17 US20100144696A1
#> 18 US20100144697A1
#> 19 US20100267704A1
#> 20 US20110098273A1
#> 21 US20110177999A1
#> 22 US20110201811A1
#> 23 US7485653
#> 24 US7488721
#> 25 US7585978
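The patent identifiers come back as plain text, so simple pattern matching is enough to group them. The sketch below uses a handful of the identifiers printed above and separates those ending in the kind code "A1" (published US applications) from those without a kind code. Treating the latter as granted patents is an assumption made for illustration; it is not something the PubChem response states.

```r
# A few identifiers copied from the output above; in practice these
# would come from the Text column of pubChemData(result).
patent_ids <- c("US20030181500A1", "US20110201811A1",
                "US7485653", "US7488721", "US7585978")

# TRUE for identifiers that end in the kind code "A1".
is_application <- grepl("A1$", patent_ids)

# Group the identifiers; the "granted" label is an assumption (see above).
grouped <- split(patent_ids,
                 ifelse(is_application, "application", "granted"))
lengths(grouped)
#> application     granted 
#>           2           3
```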
These examples illustrate how PUG REST can fetch specific data efficiently. It is well suited to users who need quick access to particular pieces of information from PubChem without the overhead of processing bulk data.
PubChem BioAssays are complex entities containing a wealth of data. PUG REST provides access to both complete assay records and specific components of BioAssay data, allowing users to efficiently retrieve the information they need.
1. Assay Description: To obtain the description section of a BioAssay, which includes authorship, general description, protocol, and data readout definitions, use the description operation with the assay's AID:
result <- get_pug_rest(identifier = "504526", namespace = "aid", domain = "assay", operation = "description", output = "JSON")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Assay
#> - Namespace: AID
#> - Operation: description
#> - Identifier: 504526
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
pubChemData(result)
#> $PC_AssayContainer
#> $PC_AssayContainer[[1]]
#> $PC_AssayContainer[[1]]$assay
#> $PC_AssayContainer[[1]]$assay$descr
#> $PC_AssayContainer[[1]]$assay$descr$aid
#> id version
#> 504526 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$aid_source
#> $PC_AssayContainer[[1]]$assay$descr$aid_source$db
#> $PC_AssayContainer[[1]]$assay$descr$aid_source$db$name
#> [1] "Southern Research Specialized Biocontainment Screening Center"
#>
#> $PC_AssayContainer[[1]]$assay$descr$aid_source$db$source_id
#> str
#> "RSV_DR6"
#>
#> $PC_AssayContainer[[1]]$assay$descr$aid_source$db$date
#> $PC_AssayContainer[[1]]$assay$descr$aid_source$db$date$std
#> year month day
#> 2012 3 18
#>
#>
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$name
#> [1] "A Cell Based HTS Approach for the Discovery of New Inhibitors of Respiratory syncytial virus (RSV) using synthesized compounds (6)"
#>
#> $PC_AssayContainer[[1]]$assay$descr$description
#> [1] "Southern Research's Specialized Biocontainment Screening Center (SRSBSC)"
#> [2] "Southern Research Institute (Birmingham, Alabama)"
#> [3] "NIH Molecular Libraries Probe Centers Network (MLPCN)"
#> [4] "Assay Provider: Dr. William Severson, Southern Research Institute "
#> [5] "Grant number: 1 R03 MH082403-01A1"
#> [6] ""
#> [7] "Assay Rationale and Summary: Currently, there are no commercially available vaccines to protect humans against Respiratory syncytial virus (RSV). RSV is associated with substantial morbidity and mortality and is the most common cause of bronchiolitis and pneumonia among infants and children under one year of age. Nevertheless, severe lower respiratory tract disease may occur at any age, especially among the elderly or among those with compromised cardiac, pulmonary, or immune systems. The existing therapies for the acute infection are ribavirin and the prophylactic humanized monoclonal antibody (Synagis from MedImmune) that is limited to use in high risk pediatric patients. The economic impact of RSV infections due to hospitalizations and indirect medical costs is greater than $ 650 million annually. The assay provider has developed and validated an HTS assay that measures cytopathic effect (CPE) induced in HEp-2 cells by RSV infection, using a luminescent-based detection system for signal endpoint. We anticipate that the proposed studies utilizing the Molecular Libraries Probes Production Network (MLPCN) HTS resources will generate multiple scaffolds targeting various junctures in the RSV viral lifecycle. These may be furthered developed into probes to construct novel single or combination therapeutics. "
#>
#> $PC_AssayContainer[[1]]$assay$descr$protocol
#> [1] "Cell Culture: HEp-2 cells (ATCC CCL-23, American Tissue Culture Type) were maintained as adherent cell lines in Optimem 1 with 2 mM L-glutamine and 10% fetal bovine serum (FBS) at 37oC in a humidified 5% CO2 atmosphere. Cells were passaged as needed and harvested from flasks using 0.05% trypsin-EDTA. "
#> [2] ""
#> [3] "Assay Media - Preparation of Complete DMEM/F12: 50 mL Pen/Strep/Glutamine (Gibco, Cat. No. 10378) was added to four liters of room temperature DMEM/F12 (Sigma, Cat. No. D6434) and the pH adjusted to 7.5 using 1N NaOH. The medium was sterile filtered through a 0.2 um filter and 10 mL of HI-FBS was added per 500 mL of media."
#> [4] ""
#> [5] "RSV culture: Human respiratory syncytial virus (HRSV) strain Long (ATCC VR-26) was used for screening. The RSV stock was prepared in HEp-2 cells using an initial stock obtained from ATCC. Briefly, HEp-2 cells were grown in two T-175 flasks to 50% confluence in Dulbecco's Modified Eagle Medium: Nutrient Mixture F-12 (CDMEM/F12), pH 7.5 with 2.5 mM L-glutamine, 2% FBS and 125 U of penicillin, 125 ug of streptomycin per ml. 0.2 ml of RSV was added to 25 ml of CDMEM/F12. After three days incubation at 37 degrees C, 5% C02 and high humidity, the supernatant was harvested and the cell debris pelleted by centrifuging at 1,000 rpm for 5 minutes at 18 degrees C. Trehalose and FBS were added to a final concentration of 10% each and the supernatant was aliquoted (1 ml per tube) fast freeze using 100% Ethanol dry ice for five minutes and stored at -80 degrees C. These virus stocks were titrated in HEp-2 cells using an agarose overlay plaque method and the titer was 6.0E+07 pfu/ml. "
#> [6] ""
#> [7] "Dose Response Compound Preparation: For dose response screening, compounds or carrier control (DMSO) were diluted to 6x in Complete DMEM/F12. Test compounds were serially diluted 1:2 resulting in an 8 point dose response dilution series. (final plate well concentration ranging from 50 uM to 0.39 uM and a final DMSO concentration of 0.5%). Twenty ul of each dilution was dispensed to assay plates (3% DMSO) in triplicate."
#> [8] "Control Drug: The positive control drug for this assay, ribavirin [1] (No. 196066, MP Biomedicals, Solon, OH) was solubilized in DMSO. It was diluted and added to the assay plates as described for test compounds. Final concentration for ribavirin was 100uM. All wells contained 0.5% DMSO."
#> [9] ""
#> [10] "Preparation of HEp-2 cells: Cells were harvested and resuspended to 178,000 cells per ml in Complete DMEM/F12."
#> [11] ""
#> [12] "Assay Set up: Sixty ul of HEp-2 cells (10,595 cells/well) forty ul of media and 20 ul of 3% DMSO were plated in the cell control wells. Sixty ul of HEp-2 cells (10,595 cells/well), forty ul of a 1:500 dilution of virus (viral MOI = 0.45) and 20 ul of compound or ribavirin control drug were added to the virus control and compound wells. All cell plating was conducted using a Matrix WellMate and cells were maintained at room temperature with stirring during the plating process. The assay plates were incubated for six days at 37 degrees C, 5% CO2 and 90% relative humidity. "
#> [13] ""
#> [14] "Data Analysis: Eight control wells containing cells only and four wells containing cells and virus were included on each assay plate and used to calculate Z' value for each plate and to normalize the data on a per plate basis. Results are reported as percent (%) CPE inhibition and were calculated using the following formula: % CPE inhibition = 100*(Test Cmpd - Med Virus)/(Med Cells - Med Virus). Four ribavirin positive control wells were included on each plate for quality control purposes. To quantify the viral cytopathic effect, IC50s were calculated for each substance using the 4 parameter Levenburg-Marquardt algorithm with the minimum and maximum parameters locked at 0 and 100, respectively. "
#>
#> $PC_AssayContainer[[1]]$assay$descr$comment
#> [1] "Possible artifacts in this assay include, but are not limited to, compounds that interfere with the luciferase reaction, absorb luminescence, or precipitate."
#> [2] ""
#> [3] "Compounds that demonstrated at least 50% inhibition and were considered active. "
#> [4] ""
#> [5] "The following tiered system has been implemented at Southern Research Institute for use with the PubChem Score. Compounds in the primary screen are scored on a scale of 0-40 based on inhibitory activity where a score of 40 corresponds to 100% inhibition. In the initial confirmatory dose response screen, active compounds were scored on a scale of 41-80 based on the IC50 result while compounds that did not confirm as actives were given the score of 0. In assays using purified and synthesized compounds a scale of 81-100 based on the IC50 result is used to denote a high level of confidence in both the substance and the data. Compounds that did not confirm activity were given the score of 0."
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref
#> $PC_AssayContainer[[1]]$assay$descr$xref[[1]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[1]]$xref
#> aid
#> 2391
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[1]]$comment
#> [1] "Primary and confirmatory screen."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[2]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[2]]$xref
#> aid
#> 2410
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[2]]$comment
#> [1] "Cytotoxicity of primary screen hits."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[3]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[3]]$xref
#> aid
#> 449732
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[3]]$comment
#> [1] "TCID50 on selected compounds."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[4]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[4]]$xref
#> aid
#> 488972
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[4]]$comment
#> [1] "Confirmatory screen (2) on purified and synthesized compounds."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[5]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[5]]$xref
#> aid
#> 488976
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[5]]$comment
#> [1] "Cytotoxicity screen (2) on purified and synthesized compounds."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[6]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[6]]$xref
#> aid
#> 2440
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[6]]$comment
#> [1] "Summary AID."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[7]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[7]]$xref
#> aid
#> 492966
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[7]]$comment
#> [1] "Confirmatory screen (3) on purified and synthesized compounds."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[8]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[8]]$xref
#> aid
#> 492968
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[8]]$comment
#> [1] "Cytotoxicity screen (3) on purified and synthesized compounds."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[9]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[9]]$xref
#> aid
#> 493016
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[9]]$comment
#> [1] "Confirmatory screen (4) on purified and synthesized compounds."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[10]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[10]]$xref
#> aid
#> 493015
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[10]]$comment
#> [1] "Cytotoxicity screen (4) on purified and synthesized compounds."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[11]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[11]]$xref
#> aid
#> 493088
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[11]]$comment
#> [1] "Confirmatory screen (5) on purified and synthesized compounds."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[12]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[12]]$xref
#> aid
#> 493090
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[12]]$comment
#> [1] "Cytotoxicity screen (5) on purified and synthesized compounds."
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$xref[[13]]
#> $PC_AssayContainer[[1]]$assay$descr$xref[[13]]$xref
#> taxonomy
#> 12814
#>
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results
#> $PC_AssayContainer[[1]]$assay$descr$results[[1]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[1]]$tid
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[1]]$name
#> [1] "IC50 Modifier"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[1]]$type
#> [1] 4
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[1]]$unit
#> [1] 254
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[2]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[2]]$tid
#> [1] 2
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[2]]$name
#> [1] "IC50"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[2]]$type
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[2]]$unit
#> [1] 5
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[2]]$ac
#> [1] TRUE
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[3]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[3]]$tid
#> [1] 3
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[3]]$name
#> [1] "% CPE Inhibition @ 50 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[3]]$description
#> [1] "Inhibition of cytopathic effect of virus at 50 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[3]]$type
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[3]]$unit
#> [1] 15
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[3]]$tc
#> concentration unit dr_id
#> 50 5 1
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[4]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[4]]$tid
#> [1] 4
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[4]]$name
#> [1] "% CPE Inhibition @ 25 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[4]]$description
#> [1] "Inhibition of cytopathic effect of virus at 25 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[4]]$type
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[4]]$unit
#> [1] 15
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[4]]$tc
#> concentration unit dr_id
#> 25 5 1
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[5]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[5]]$tid
#> [1] 5
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[5]]$name
#> [1] "% CPE Inhibition @ 12.5 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[5]]$description
#> [1] "Inhibition of cytopathic effect of virus at 12.50 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[5]]$type
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[5]]$unit
#> [1] 15
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[5]]$tc
#> concentration unit dr_id
#> 12.5 5.0 1.0
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[6]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[6]]$tid
#> [1] 6
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[6]]$name
#> [1] "% CPE Inhibition @ 6.25 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[6]]$description
#> [1] "Inhibition of cytopathic effect of virus at 6.25 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[6]]$type
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[6]]$unit
#> [1] 15
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[6]]$tc
#> concentration unit dr_id
#> 6.25 5.00 1.00
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[7]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[7]]$tid
#> [1] 7
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[7]]$name
#> [1] "% CPE Inhibition @ 3.125 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[7]]$description
#> [1] "Inhibition of cytopathic effect of virus at 3.125 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[7]]$type
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[7]]$unit
#> [1] 15
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[7]]$tc
#> concentration unit dr_id
#> 3.125 5.000 1.000
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[8]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[8]]$tid
#> [1] 8
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[8]]$name
#> [1] "% CPE Inhibition @ 1.563 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[8]]$description
#> [1] "Inhibition of cytopathic effect of virus at 1.563 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[8]]$type
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[8]]$unit
#> [1] 15
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[8]]$tc
#> concentration unit dr_id
#> 1.563 5.000 1.000
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[9]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[9]]$tid
#> [1] 9
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[9]]$name
#> [1] "% CPE Inhibition @ 0.781 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[9]]$description
#> [1] "Inhibition of cytopathic effect of virus at 0.781 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[9]]$type
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[9]]$unit
#> [1] 15
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[9]]$tc
#> concentration unit dr_id
#> 0.781 5.000 1.000
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[10]]
#> $PC_AssayContainer[[1]]$assay$descr$results[[10]]$tid
#> [1] 10
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[10]]$name
#> [1] "% CPE Inhibition @ 0.391 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[10]]$description
#> [1] "Inhibition of cytopathic effect of virus at 0.391 uM"
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[10]]$type
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[10]]$unit
#> [1] 15
#>
#> $PC_AssayContainer[[1]]$assay$descr$results[[10]]$tc
#> concentration unit dr_id
#> 0.391 5.000 1.000
#>
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$revision
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$activity_outcome_method
#> [1] 2
#>
#> $PC_AssayContainer[[1]]$assay$descr$dr
#> $PC_AssayContainer[[1]]$assay$descr$dr[[1]]
#> $PC_AssayContainer[[1]]$assay$descr$dr[[1]]$id
#> [1] 1
#>
#> $PC_AssayContainer[[1]]$assay$descr$dr[[1]]$descr
#> [1] "CR Plot Labels 1"
#>
#> $PC_AssayContainer[[1]]$assay$descr$dr[[1]]$dn
#> [1] "Concentration"
#>
#> $PC_AssayContainer[[1]]$assay$descr$dr[[1]]$rn
#> [1] "Response"
#>
#>
#>
#> $PC_AssayContainer[[1]]$assay$descr$grant_number
#> [1] "1R03 MH084847-01"
#>
#> $PC_AssayContainer[[1]]$assay$descr$project_category
#> [1] 2
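The description printed above is a deeply nested list, so it is usually easier to flatten the parts of interest into data frames than to read the raw printout. The following is a minimal sketch, not part of PubChemR: it builds a small mock list (the hypothetical name assay_data) that mirrors only a few of the xref and results entries shown above, so it runs without a network connection. With real data, assay_data would be whatever pubChemData() returns for the description request, and the field layout is assumed to match the printout.

```r
# Mock object mirroring a small part of the structure printed above.
# ASSUMPTION: with real data this would be the list returned by pubChemData().
assay_data <- list(PC_AssayContainer = list(list(assay = list(descr = list(
  xref = list(
    list(xref = c(aid = 2391), comment = "Primary and confirmatory screen."),
    list(xref = c(aid = 2410), comment = "Cytotoxicity of primary screen hits."),
    list(xref = c(taxonomy = 12814))  # no comment field, as in the output above
  ),
  results = list(
    list(tid = 1, name = "IC50 Modifier", type = 4, unit = 254),
    list(tid = 2, name = "IC50", type = 1, unit = 5, ac = TRUE),
    list(tid = 3, name = "% CPE Inhibition @ 50 uM", type = 1, unit = 15,
         tc = c(concentration = 50, unit = 5, dr_id = 1))
  )
)))))

# Shortcut to the description block.
descr <- assay_data$PC_AssayContainer[[1]]$assay$descr

# Cross-references: one row per entry; a missing comment becomes NA.
xref_table <- do.call(rbind, lapply(descr$xref, function(x) {
  data.frame(
    type = names(x$xref)[1],
    id = unname(x$xref)[1],
    comment = if (is.null(x$comment)) NA_character_ else x$comment,
    stringsAsFactors = FALSE
  )
}))

# Result columns: a test concentration exists only for the dose-response readouts.
result_table <- do.call(rbind, lapply(descr$results, function(r) {
  data.frame(
    tid = r$tid,
    name = r$name,
    concentration = if (is.null(r$tc)) NA_real_ else unname(r$tc["concentration"]),
    stringsAsFactors = FALSE
  )
}))

print(xref_table)
print(result_table)
```

Handling the optional fields explicitly (comment and tc) matters because some entries omit them, as the taxonomy cross-reference and the IC50 columns do in the output above.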
For a simplified summary format that includes target information and the counts of active and inactive SIDs and CIDs, request the summary operation:
result <- get_pug_rest(identifier = "1000", namespace = "aid", domain = "assay", operation = "summary", output = "JSON")
result
#>
#> An object of class 'PugRestInstance'
#>
#> Request Details:
#> - Domain: Assay
#> - Namespace: AID
#> - Operation: summary
#> - Identifier: 1000
#>
#> NOTE: Run getter function 'pubChemData(...)' to extract raw data retrieved from PubChem Database.
#> See ?pubChemData for details.
pubChemData(result)
#> $AssaySummaries
#> $AssaySummaries$AssaySummary
#> $AssaySummaries$AssaySummary[[1]]
#> $AssaySummaries$AssaySummary[[1]]$AID
#> [1] 1000
#>
#> $AssaySummaries$AssaySummary[[1]]$SourceName
#> [1] "SRMLSC"
#>
#> $AssaySummaries$AssaySummary[[1]]$SourceID
#> [1] "SPE-MK-Sec"
#>
#> $AssaySummaries$AssaySummary[[1]]$Name
#> [1] "Screening for Inhibitors of the Mevalonate Pathway in Streptococcus Pneumoniae - MK Secondary Assay"
#>
#> $AssaySummaries$AssaySummary[[1]]$Description
#> [1] "Southern Research Molecular Libraries Screening Center (SRMLSC) "
#> [2] "Southern Research Institute (Birmingham, Alabama) "
#> [3] "NIH Molecular Libraries Screening Centers Network (MLSCN) "
#> [4] "Assay Provider: Dr. Thomas S. Leyh, Albert Einstein College of medicine of Yeshiva University "
#> [5] "Award: R03 MH078936-01"
#> [6] ""
#> [7] "Streptococcus pneumonia (SP) takes the lives of nearly 4,000 people daily and antibiotic resistant strains are becoming an increasing problem. Because of this, the discovery of drugs targeting novel pathways such as the mevalonate pathway has become increasingly important. The pathway produces isopentyl diphosphate (the molecular building block of isoprenoids) and is essential for the survival of the pathogen in mouse lung and serum. The mevalonate pathway is comprised of three consecutive reactions that are catalyzed by the enzymes mevalonate kinase (MK; E.C. 2.7.1.36), phosphomevalonate kinase (PMK; E.C. 2.7.4.2), and diphosphomevalonate decarboxylase (PDM-DC; E.C. 4.1.1.33). MK catalyzes the ATP dependent conversion of (R)-mevalonate to ADP plus (R)-5-phosphomevalonate. "
#> [8] ""
#> [9] "The activity of MK was measured spectrophotometrically by coupling the formation of ADP to the reactions of pyruvate kinase and lactate dehydrogenase. The rate of ADP formation was quantitated by the reduction of absorbance (OD340) due to the oxidation of NADH to NAD by lactate dehydrogenase. A kinetic assay was chosen to minimize interference by compounds that absorbed at 340 nm."
#> [10] ""
#> [11] "A total of 57 compounds were initially screened at a final concentration of 100 uM. Compounds with more than 20% inhibition at 100 uM were then tested in dose response assays at seven concentrations, ranging from 100 uM to 0.0156 uM depending on the % inhibition for each compound in the initial 100 uM screen. To confirm that the compounds were specifically inhibiting MK and not one of the other enzymes in the assay, the compounds were tested in parallel in an assay that contained hexokinase as the enzyme instead of MK and glucose as the substrate instead of mevalonate. None of the compounds tested inhibited in this 'coupling enzymes' assay."
#>
#> $AssaySummaries$AssaySummary[[1]]$Protocol
#> [1] "Mevalonate Kinase Protocol for 1 mL cuvet assay"
#> [2] ""
#> [3] "Purified recombinant MK enzyme was provided by Dr. Thomas Leyh, Albert Einstein College of Medicine of Yeshiva University. "
#> [4] ""
#> [5] "590 uL of MK reagent mix which included NADH, mevalonate, ATP, phosphoenolpyruvate, MgCl2, KCl, pyruvate kinase, and lactate dehydrogenase in buffer was added to each UV-VIS semimicro cuvet. Compounds were then added to the cuvets in 10 uL volumes in DMSO. The reaction was initiated with the addition of 400 uL of MK, diluted in assay buffer. The final concentrations in the reaction were 0.4 mM NADH, 0.2 mM melalonate, 6 mM MgCl2, 50 mM KCl, 5 mM ATP, 2 mM potassium phosphoenolpyruvate, 3 units/mL each of rabbit muscle pyruvate kinase and rabbit muscle lactate dehydrogenase, 1% DMSO, and 60 nM Mevalonate Kinase (MK) diluted in 50 mM HEPES buffer (pH 7.8) in a final volume of 1000 uL. The cuvets were immediately transferred to a CARY 3E UV-Visible Spectrophotometer and absorbance was measured at 340 nm for 3 minutes. Each compound was tested in triplicate for the initial 100 uM screen and for the dose response assays each concentration was in triplicate. Full reaction controls were assays with 10 uL of DMSO added instead of compound. Background controls were assays in which 400 uL of assay buffer was added instead of MK. "
#> [6] ""
#> [7] "The 'coupling enzyme' assays were performed as above by replacing MK with hexokinase and mevalonate with glucose."
#>
#> $AssaySummaries$AssaySummary[[1]]$Comment
#> [1] "Percent inhibition in the single dose assay was determined by comparing the reaction rates of the compounds to that of the rate for the full reaction after subtracting the background rate u