Chemical Literature by Dr. Adrian Culf

(revised January 25, 2010 by Brian McNally)

All discoveries made in the laboratory must be published somewhere if the information is to be known: “is known” or “has been done” really means “has been published”. The chemical literature is both vast and complex, and the ability to use it is a necessary skill for the practising chemist. I hope to give you some basic tools and concepts for making more efficient use of the literature and of computerized databases, which are now the dominant tools for both current awareness and in-depth literature searching.

In many cases, it will turn out that the best single resource for your topic will be a search of the Chemical Abstracts files via SciFinder Scholar. Be sure to tap the skills of the librarians to suggest the best approaches for your search.

 Why spend time talking about the chemical literature?

  • Because the subject is HUGE...
    • Chemistry strictly defined is large, and it overlaps into physics, biology, medicine, pharmaceutics, geology, materials engineering, forensics, etc.
    • In many areas of chemistry, notably synthesis, the older literature is as relevant as the newest literature. 19th century journals are still consulted for synthesis work!
    • In many areas of chemistry, the patent literature is as important as the more familiar journal literature.
  • Because the subject is COMPLEX...
    • Chemists are interested in information which cannot be readily defined merely by key words, such as ranges of numeric data, sets of substances with particular structural features, or macromolecules (both biomolecules and synthetics) with particular sequences of structural units.
    • The terminology of chemistry, especially chemical nomenclature, is incredibly complex!
    • The patent segment of the literature is often written in terminology obscure even to trained chemists.
  • Because the tools available for chemists are RAPIDLY EVOLVING...
    • Only a few years ago, there was very little on the Internet of interest to chemists. Now, traditional journals and databases have been reinvented for the World Wide Web, and new resources have sprung up.

The chemical researcher can benefit from learning how chemical information is organized and how to search it.

 Types of scientific literature

  • PRIMARY - The original publication of data: journals, patents, technical reports, conferences, dissertations, preprints, some books. Journals as we know them today have been published for about 150 years.
  • SECONDARY - Publications which provide access to the primary literature: reviews, indexes, abstracts, data collections, book series, textbooks, etc.

Approaches to organizing the scientific literature

  • Classification and Data Collection - physically grouping related data by some common element.
  • Indexing - creating pointers to the original literature based on some piece of information in the original, e.g. author names or subject terms.

Classification & Data Collection

  • Libraries use classification schemes to group related books together for browsing by subject. In the Library of Congress system, chemistry materials fall under QD. This system of classification is used at Mount Allison University.
  • Data collections bring together information from various primary sources for easier location, e.g. the CRC handbook series.

 Here are the key steps for intelligent, methodical information seeking:

  • Define what you're looking for; determine what information will satisfy your needs.
  • Determine what you already know - a subject term, an author, a known reference - that can serve as a starting point for your search.
  • Decide which tools can best find answers based on your initial information. What's "best" will vary not only with the problem at hand, but with the available resources at that time and place. For example, Mount Allison University only has a small subset of the available tools and resources for chemistry.
  • Find an initial set of "hits" and select the most relevant ones.
  • Decide if these answers satisfy your need for information. If not...
  • Review those answers for new clues - terms, authors, cited references, citing references, etc.
  • From these, repeat the cycle until satisfied.
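
The cycle above can be sketched in code. This is only a minimal illustration in Python, not a real search tool: the function name iterative_search, the callables passed to it and the toy index are all hypothetical stand-ins for whatever databases and judgement you actually use.

```python
from typing import Callable, Iterable, Set


def iterative_search(
    seed_clues: Iterable[str],
    search: Callable[[str], Set[str]],
    extract_clues: Callable[[str], Set[str]],
    is_satisfied: Callable[[Set[str]], bool],
    max_rounds: int = 5,
) -> Set[str]:
    """Repeat search -> select -> review until satisfied or out of new clues."""
    pending = list(seed_clues)   # clues still to be searched
    tried: Set[str] = set()      # clues already searched
    hits: Set[str] = set()       # documents collected so far
    for _ in range(max_rounds):
        if not pending or is_satisfied(hits):
            break
        new_hits: Set[str] = set()
        for clue in pending:
            tried.add(clue)
            new_hits |= search(clue) - hits
        hits |= new_hits
        # Review the new answers for fresh clues (terms, authors, references).
        pending = sorted({c for doc in new_hits for c in extract_clues(doc)} - tried)
    return hits


# Toy data (invented): clue -> documents found, document -> clues it yields.
INDEX = {
    "suzuki coupling": {"doc1", "doc2"},
    "author: Smith": {"doc2", "doc3"},
    "palladium catalyst": {"doc4"},
}
CLUES = {
    "doc1": {"author: Smith"},
    "doc2": {"palladium catalyst"},
    "doc3": set(),
    "doc4": set(),
}

found = iterative_search(
    ["suzuki coupling"],
    search=lambda clue: INDEX.get(clue, set()),
    extract_clues=lambda doc: CLUES.get(doc, set()),
    is_satisfied=lambda hits: len(hits) >= 4,
)
print(sorted(found))  # ['doc1', 'doc2', 'doc3', 'doc4']
```

The first round finds two documents from the starting subject term; the second round follows the author and the new term found in those documents to reach the other two.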

Defining the Problem

  • Intellectual scope: The needs of researchers for information can range from a single datum, say, the melting point of a known compound, to a comprehensive review of the literature for the best methods for a new synthesis of a complex natural product, to a patentability search which tries to demonstrate that a new invention has never been previously reported in any published literature. The searcher should decide in advance just how comprehensive the search must be.
  • Chronological scope: Based on your knowledge of the field, how far back into the published literature will you need to look to satisfy your needs? Research on nanodevices doesn't need to go back more than a few years, for example, but searches for synthetic methods or natural products may need to delve as far back as sources allow.

If you have at least one document in hand that is pertinent to your search, you have a source of numerous possible leads:

    • Its authors may have published additional material on the topic. You can use their names for author searches.
    • The author(s)'s address(es) may point you to other research done at their institution or company. You may even want to contact the author(s) directly and pose your questions to them.
    • The text of the document may yield additional search terms or synonyms for the ones you already have.
    • You can look up the document in an appropriate index, and see if it leads to search terms used by that database. Patents may yield national or international classification codes which you can use for further searching.
    • Most scholarly papers and patents have a list of references or a bibliography. These can lead you to older but still relevant documents useful to your search. Some indexes let you find documents which cite some of the same references.
    • You can use the document as a starting point by finding indexes which allow you to locate the more recent papers which cite your starting document.
    • In the chemical literature, you will frequently find relevant chemical names, biosequences (e.g. proteins and polynucleotides), chemical identification numbers, structure diagrams and reaction diagrams. All of these can be starting points for searches in appropriate databases. 

Search and Select

  • Always remember that no single search will find every possible reference on a topic (if the topic is at all complex) and that searches will rarely give a set of results without any irrelevant answers.
  • There are always trade-offs between getting ALL relevant answers and getting ONLY relevant answers.
  • Ultimately, you will have to evaluate the answers you find, both to weed out the irrelevant from the relevant, and to decide if you have answered your original question adequately. You may find that your original question wasn't exactly what you needed and need to revise it.
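
The trade-off between getting ALL relevant answers and getting ONLY relevant answers is commonly measured as "recall" and "precision". Those two terms are standard in information retrieval but are not used in these notes, and the sets below are invented for illustration. A minimal sketch in Python:

```python
def precision_recall(retrieved: set, relevant: set) -> tuple:
    """Return (precision, recall) for one search result.

    precision = fraction of retrieved answers that are relevant (ONLY relevant)
    recall    = fraction of relevant answers that were retrieved (ALL relevant)
    """
    found = retrieved & relevant
    precision = len(found) / len(retrieved) if retrieved else 0.0
    recall = len(found) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: four relevant papers exist in the database.
relevant = {"a", "b", "c", "d"}
narrow_search = {"a", "b"}                                # few hits, all relevant
broad_search = {"a", "b", "c", "d", "w", "x", "y", "z"}   # every hit, plus noise

print(precision_recall(narrow_search, relevant))  # (1.0, 0.5)
print(precision_recall(broad_search, relevant))   # (0.5, 1.0)
```

The narrow search returns nothing irrelevant but misses half of the papers; the broad search finds them all but half of what it returns must be weeded out by hand.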

If You're Not Satisfied...

  • Assuming that you found at least some useful sources, now examine them for new "clues" to deepen your search. Each source you found will yield more potential starting points for your search. This is referred to as the Iterative Approach to literature searching.

 Arrangement of Materials

  • The Mount Allison Library follows the Library of Congress classification system.
    • The first group of letters signifies the broad subject area.
    • The first group of numbers signifies the more specific subject area.
    • The subsequent letters and numbers identify the individual book, and are usually based on the author's name and/or book's title.
  • "Traditional" subject areas are well grouped:
    • QD = chemistry
    • QD 241-449 = organic chemistry
    • QD 380-388 = organic polymer chemistry
    • QD 410-413 = organometallic chemistry
    • QD 415-449 = biological chemistry  
    • QC 450-499 = spectral analysis
    • QP 501-801 = biochemistry
    • RS = pharmacy
    • TP = chemical technology
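
As an illustration of how a call number breaks down, here is a minimal Python sketch that splits a call number into its letters, its number and the remainder, and then looks the result up in the ranges listed above. The function names and the regular expression are my own assumptions, and the table covers only the ranges given in these notes, not the full Library of Congress schedule.

```python
import re

# Ranges copied from the list above. The first match wins, so the narrower
# QD ranges are listed before the broad organic chemistry range.
SUBJECT_RANGES = [
    ("QD", 380, 388, "organic polymer chemistry"),
    ("QD", 410, 413, "organometallic chemistry"),
    ("QD", 415, 449, "biological chemistry"),
    ("QD", 241, 449, "organic chemistry"),
    ("QC", 450, 499, "spectral analysis"),
    ("QP", 501, 801, "biochemistry"),
]
BROAD_CLASSES = {"QD": "chemistry", "RS": "pharmacy", "TP": "chemical technology"}

CALL_NUMBER = re.compile(r"^\s*([A-Z]{1,3})\s*(\d+(?:\.\d+)?)\s*(.*)$")


def parse_call_number(call_number: str) -> tuple:
    """Split e.g. 'QD 251 .R6 1964' into ('QD', 251.0, '.R6 1964')."""
    match = CALL_NUMBER.match(call_number.upper())
    if match is None:
        raise ValueError(f"not a recognizable call number: {call_number!r}")
    letters, number, rest = match.groups()
    return letters, float(number), rest.strip()


def subject_for(call_number: str) -> str:
    """Return the subject area from the table above, or 'unknown'."""
    letters, number, _ = parse_call_number(call_number)
    for cls, low, high, subject in SUBJECT_RANGES:
        if letters == cls and low <= number <= high:
            return subject
    return BROAD_CLASSES.get(letters, "unknown")


print(parse_call_number("QD 251 .R6 1964"))  # ('QD', 251.0, '.R6 1964')
print(subject_for("QD 251 .R6 1964"))        # organic chemistry
print(subject_for("QD 65 .H3"))              # chemistry
print(subject_for("RS 51 .M4"))              # pharmacy
```

Location prefixes such as "REFERENCE" are not part of the call number and would need to be removed before parsing.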

Data Collections

  • These are a form of secondary literature in which an editor selects information from primary sources and arranges it to facilitate a particular type of access.
  • Often, the data are reviewed and evaluated by the editors before inclusion, adding further value.
  • The right data collection can be more useful than searching primary sources, depending on the objective of your search.

Types of data collections

  • Dictionaries
    e.g. Merck Index
  • Encyclopedias
    e.g. Kirk-Othmer
  • Physical data collections (including spectra collections)
  • Reaction and synthesis guides
    These may collect preparations of individual compounds, applications of individual reagents or general methods grouped by type of reaction, type of starting material or type of product.
  • Analytical methods guides
    These may deal with specific or general techniques, grouped by analyte, matrix, or method.
  • Comprehensive works
    These are usually ongoing series, attempting to summarize all of a given area of chemistry. Good examples include the Beilstein Handbook of Organic Chemistry and the Gmelin Handbook of Inorganic and Organometallic Chemistry. 

Examples of Frequently Used Data Books in Chemistry (Mt.A.)

Secondary Sources

Journal articles, patents and the Internet contain virtually all the original work in organic chemistry. However, if this were all – if there were no indices, guides, review articles and other, secondary, sources – the literature would be unusable because it is so vast no one could hope to find anything in particular. Fortunately, there are secondary sources:

CRC Handbook of Chemistry and Physics (REFERENCE QD 65 .H3)

  • Familiar source; published annually but usually changes little from one year to the next.
  • Variety of useful physical and chemical data, with some references. Tables are grouped in broad subject sections. Arrangement within tables varies.
  • Most frequently used for tables of organic compounds and inorganic compounds, which contain data on melting points, boiling points, density and solubility among others.
  • Note that both tables have synonym indexes following the table.
  • Not very systematic in choice of data, and indexing can be inconsistent.
  • Provides CAS Registry Numbers (CAS RNs).

Merck Index (REFERENCE RS 51 .M4)

  • Published by Merck Pharmaceuticals, with data primarily on organics, strongest on medicinal compounds.
  • Includes physical data, preparation references, toxicity, and uses.
  • Arranged alphabetically by chemical name; structure is given; well-indexed; updated irregularly.
  • The Merck Index is now available in a Web version from CambridgeSoft (see http://products.cambridgesoft.com/themerckindex.cfm).
  • Provides CAS Registry Numbers (CAS RNs).

Aldrich Catalogue

  • Includes basic physical data, cross-references to Beilstein, Merck and Fieser, and safety information.
  • Arranged alphabetically, with indexes by molecular formula and CAS Registry Number.
  • The combined Aldrich and Sigma chemical catalogs are searchable on the Web at http://www.sigmaaldrich.com/
  • Note that in both print and online versions, a single compound may appear in a number of different product records, usually representing various grades of purity or source. Note also, physical property data is usually only listed for the highest grade version of a given compound.
  • Provides CAS Registry Numbers (CAS RNs).
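
Since CAS Registry Numbers copied from handbooks and catalogues are a common starting point for database searches, it can be worth checking that a number has been transcribed correctly. The last digit of a CAS RN is a check digit: each of the other digits is multiplied by its position counted from the right, and the sum modulo 10 must equal the check digit. The sketch below is a minimal Python illustration; the function name is my own, and passing the check only means the number is well-formed, not that it has been assigned to a substance.

```python
import re


def is_valid_cas_rn(cas_rn: str) -> bool:
    """Check the format (2-7 digits, 2 digits, 1 digit) and the check digit."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", cas_rn.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)   # all digits except the check digit
    check_digit = int(match.group(3))
    # Weight 1 for the rightmost body digit, 2 for the next one, and so on.
    total = sum(weight * int(digit)
                for weight, digit in enumerate(reversed(body), start=1))
    return total % 10 == check_digit


print(is_valid_cas_rn("7732-18-5"))  # water   -> True
print(is_valid_cas_rn("64-17-5"))    # ethanol -> True
print(is_valid_cas_rn("7732-18-4"))  # wrong check digit -> False
```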

Encyclopedia of Chemical Technology

  • Commonly referred to as "Kirk-Othmer" after its early editors.
  • Wide-ranging, authoritative encyclopedia of chemical and process information.
  • 4th edition is now complete in 25 volumes plus supplement and index; 3rd and earlier editions still useful. Wiley is now releasing a 5th edition. Very strong on industrially important chemicals.
  • Good subject indexing, cross-references and bibliographies.
  • Wiley has made this encyclopedia available in a browsable and searchable Web version at http://www.mrw.interscience.wiley.com/kirk/.

Rodd's Chemistry of Carbon Compounds (QD 251 .R6 1964)

  • Series of review volumes with ongoing supplements on organic compounds.
  • Organized by chemical class; 1st by acyclic, alicyclic, aromatic, or heterocyclic, then by other structural features.
  • Very good on reactions & biochemicals
  • Best all-English handbook of organic chemical information.

Organic Reactions (QD 251 .O7)

  • Annual publication with review articles on important synthetic methods.
  • Articles are published in no particular order, but the series is well-indexed, with cumulative author and chapter/topic indexes in each volume for all the preceding volumes.
  • Wiley is releasing an electronic version on the Web (see http://www3.interscience.wiley.com/cgi-bin/mrwhome/107610747/HOME). It currently includes volumes 48-62 of the printed work.

Organic Syntheses (QD 262 .O7)

  • Annual publication with tested syntheses of organic and organometallic compounds.
  • Gives detailed descriptions of synthetic techniques, reagents, yields and safety aspects.
  • Well-indexed (authors, compound names, reaction types, molecular formulae)
  • Collective volumes include revised and updated syntheses from annual volumes. There is a cumulative index for the first eight collective volumes.
  • The publishers, in collaboration with Wiley and CambridgeSoft, have released a FREE Web version at http://www.orgsyn.org/. With a free chemical drawing plug-in available at the Web site, the online version is substructure searchable.
  • Wiley has also released a somewhat more up-to-date subscription version. Note that articles in this Wiley reference work (and many others) are available on a pay-per-view basis to individual users.

Fieser and Fieser's Reagents for Organic Synthesis (QD 262 .F5)

  • Classic series reporting on new reagents and new uses for old reagents.
  • Published less-than-annually.
  • Alphabetical list of reagents, with author and subject index.
  • Cumulative index for Vols. 1-12.

Integrated Spectral Data Base System (SDBS)

http://www.aist.go.jp/RIODB/SDBS/menu-e.html
This site, from the National Institute of Materials and Chemical Research in Japan, contains full spectra and, in many cases, peak assignments for about 31,000 compounds, including about 21,800 mass spectra, 12,000 13C NMR, 13,900 proton NMR, 49,000 IR, 3,500 Raman and 2,000 ESR spectra. The database is searchable by compound name, CAS Registry Number, molecular formula and NMR or IR peaks. The database is free to the public, but users are asked to download no more than 50 spectra per day without specific permission of the site owners.

NIST Chemistry Webbook

http://webbook.nist.gov/
Among other data, the NIST Chemistry Webbook has IR spectra for over 16,000 compounds, mass spectra for over 15,000 compounds, UV/visible spectra for over 400 compounds and electronic spectra for over 4000 compounds, which may be searched in a variety of ways, displayed and printed. Note that the variety of data available here is growing; it is well worth checking. The Webbook may also be searched by keyword, property or chemical name, along with a large number of NIST databases, at the NIST Data Gateway (http://srdata.nist.gov/gateway/).

Aldrich Library of Infrared Spectra, 3rd. ed. (QD 96 .I5 P67 1981)
Sadtler Guide to Carbon 13 NMR Spectra (Ref QC 762 .S28 1983)
Aldrich Library of NMR Spectra (3 volumes, QD 96 .N8 P68 1983)
Aldrich Library of 13C and 1H FT NMR Spectra (2 volumes, Ref QD 96 .F68 P67 1993) 

 Comprehensive works: Beilstein and Gmelin

  • These two series represent attempts to gather into data collections every piece of significant information in the organic and inorganic literature, respectively.
  • While both began in the 1800's in Germany, they take significantly different approaches to organizing their data.

Beilstein Handbook of Organic Chemistry (QD 251 .B4 1918- INCOMPLETE)

  • Begun in the 1880's by Friedrich Konrad Beilstein.
  • Went through three complete editions before the current series began in 1918.
  • Most of Beilstein is in German; the current (5th) supplement is in English.

Primary Literature: Publication of Information

The major forms of primary scientific publication include:

Scientific Journals

The scientific journal was invented in the mid-1600's as a means of speeding scholarly communication; Philosophical Transactions of the Royal Society dates from this period. As science grew, so did the volume of literature and the specialization of journals. Today there are over 100,000 scientific journals.

Types of Journals

Journals vary widely in degree of specialization, from

Type of Article

Journals vary in types of articles:

  • News and reviews: Chemistry World (RSC); Chemical & Engineering News (ACS)
    These magazines specialize in short summaries of "hot" current research, usually in language aimed at the non-specialist, often written by professional journalists (with some scientific background) rather than by professional scientists.
  • Major reviews: Accounts of Chemical Research; Chemical Society Reviews
    These journals specialize in longer articles summarizing the research in a particular field, usually over a specified chronological range. These are generally written by scientists who are expert in the field. A review is an intensive survey of a rather narrow field of study (e.g. “The use of chemically modified RNA in breast cancer gene silencing”). A good review article is of enormous value as it represents a thorough survey of all the work done in the field under discussion.
  • Major original papers: Dalton Transactions; Tetrahedron
    These journals (the majority of scholarly journals) carry full-length articles on original research.
  • Brief communications: Chemical Communications; Organic Letters; Rapid Communications in Mass Spectrometry
    Some journals specialize in rapid publication of short announcements of research results.

Some major journals of interest to organic chemists include:

From the American Chemical Society:

Accounts of Chemical Research (1968) – reviews

Aldrichimica Acta (1968) – online reviews at Aldrich/Sigma

Biochemistry

Chemical Reviews (1924) - reviews

Journal of the American Chemical Society (1879)

Journal of Medicinal Chemistry

Journal of Organic Chemistry (1936)

Heterocycles (1973)

From the Royal Society of Chemistry:

Chemical Communications (1965)

Chemical Society Reviews (1972) - reviews

Organic & Biomolecular Chemistry (formerly Journal of the Chemical Society, Perkin Transactions I) (1841)

Russian Chemical Reviews

From Elsevier publishers:

Tetrahedron Letters (1959)

Tetrahedron (1958) - reviews

 

Others:

Angewandte Chemie, Intl. Ed. Engl. (1962) - reviews

Canadian Journal of Chemistry (1929)

 

Some specialist journals that I look at for my research:

Bioconjugate Chemistry

Nucleic Acids Research

RNA

Journal of Heterocyclic Chemistry

Heterocycles (1973)

Nucleosides, Nucleotides & Nucleic Acids

Molecular Cancer Therapeutics

Cancer Gene Therapy

Drug Discovery & Development

Drug Discovery Today

Genomics & Proteomics

Journal of Peptide Research

Protein & Peptide Letters

Analytical Biochemistry

Journal of Nuclear Medicine

Journal of Labelled Compounds and Radiopharmaceuticals

Journal of Nuclear Medicine and Biology

Journal of Applied Radiation and Isotopes

European Journal of Nuclear Medicine

Journal of Fluorine Chemistry

Structure of a Journal Article

  • The structure will vary with the type of article (see above), but a typical full research article will include:
    • Bibliographic information (article title, authors, author addresses, usually including e-mail for the corresponding author)
    • Abstract (may include keywords)
    • Introduction
    • Experimental section; depending on the topic, these may be reagents and reactions, or computational methods.
    • Results and Discussion
    • Conclusions
    • Acknowledgments
    • References
    • The article may have supplementary or supporting information online.
  • The precise structure required will vary somewhat from journal to journal; nearly all use an "Instructions to Authors" section on the journal home page to inform prospective authors of their requirements, including methods of submission, article format, citation styles for references, etc.
  • An increasing number of journals have some form of alerting service, which allows the user to receive e-mail notice
  • While maintaining a large, sophisticated e-journal site is not cheap, it is possible to publish relatively inexpensively on the Web. This has led to such phenomena as Web-only journals.

Peer Review

  • The majority of scientific journals publish peer-reviewed articles, also called refereed articles.
  • In these journals, the editor sends submitted articles out to persons expert in the field of the article.
  • The referee comments on the article and the research it presents.
  • The editor then decides whether to accept the article as is, send it back to the author for revision, or reject it outright.
  • Reviewing helps uphold scientific standards, but it adds to the delay between research and publication - often a year between submission and publication.
  • Note that electronic processing methods, such as e-mail of manuscripts between authors, editors and referees, are speeding up the process.

Conference Papers

  • Papers presented at a conference are often the fastest way of publishing hot new information.
  • But conference papers are often hard to locate in print; indexing can be slow, and the papers may not be refereed.
  • Conference papers in chemistry are infrequently available in electronic form. Searching general web databases like Google may yield conference papers which have been mounted on the Web by their authors. The American Chemical Society publishes selected conference proceedings in book form as part of the ACS Symposium Series.
  • Papers may be published as part of a journal, as a special monograph, or as part of a monographic series. 

Open Access: The Buzzword in Scientific Publishing

  • A current discussion among scientists, funding agencies, publishers and information professionals concerns who should pay for access to scientific information. Some now advocate the notion of open access - the idea that scientific research should be made public without cost to its readers.
    • open access journals - Continue the traditional journal structure, but support the journals through some means other than subscription fees.
  • It is usually proposed that the cost of publication be shifted from the subscriber to the authors of the papers, who, in turn, will pass the costs on to their institutions or funding agencies. BioMedCentral, at http://www.biomedcentral.com/home/, is a publishing service established in recent years which has launched over 100 new titles, mainly in biomedicine, and mainly open access. They charge authors from $620 to $1570 per article depending on the journal.
  • Some journals have tried to compromise by making their backfiles available free of charge, while continuing to charge subscription fees for the most recent issues.

Intellectual Property

Intellectual property is the legal concept that one can own the products of one's intellectual labour, such as inventions, prose, poetry and so forth. By enacting intellectual property law, governments can provide inventors, authors and artists with a legal monopoly to profit from their works. In Canadian (& American) law, intellectual property is of four types:

  • Copyrights apply to the expression of an idea - literature, art, music...or software.
  • Trademarks and Service Marks cover the recognizable symbols of a company, organization or product.
  • Patents cover tangible inventions. See: What Every Chemist Should Know About Patents (http://www.chemistry.org/portal/resources/ACS/ACSContent/government/publications/Chem_patent2001.pdf)
  • Trade secrets are undisclosed inventions; theft is illegal, but...there is nothing to prevent a competitor from "reverse engineering" the product.
  • Patents are a monopoly on the manufacture and sale of inventions granted by a government in return for the publication of the details of the invention.
  • Patents may be assigned by the inventor to another person or corporation.
  • Patents are the most important form of publication for industrial research. 

Patents as information sources

Patents are:

  • sources of legal information - who owns the right to manufacture a given invention in a given country.
  • sources of business information - competitive intelligence - What companies are working in a given field? Who are the prime inventors or experts in a field?
  • sources of technical information - they give the necessary information to replicate an invention. 

What may be patented?

  • Machines - includes means of production and consumer goods.
  • Manufactures - mainly consumer goods
  • Designs - e.g. packaging, decoration
  • Plants - agriculture, horticulture
  • Processes - including chemical ones
  • Compositions of matter - i.e. chemical substances

Requirements for patentability

  • Novelty - The invention must be "new"; not existing in "prior art".
  • Unobviousness - The invention must not be obvious to an observer "skilled in the art".
  • Utility - The invention must be useful. A new compound cannot be patented on its own; a use for the compound must be shown.

Sources of Patent Information

Several Web patent databases are available:

The US Patent and Trademark Office has its own bibliographic database at http://patents.uspto.gov/. This site has full text and page images back to 1976. Patents from 1790 to 1975 may be searched by patent number and subject classification only, and displayed as page images. Note: Images are currently in Quick Time Image (TIFF) format and require a special viewer which can be downloaded free of charge.

The European Patent Office has Esp@cenet, at http://ep.espacenet.com/, which allows searching of European, WIPO, Japanese, and worldwide patents in general. The full text of patents is available free online for the last ten years. Note that the full text is in a format which is difficult to print (A4 size paper). Earlier years are stored offline and may be ordered.

Patents are not usually as reliable as journal papers as a source of information, for two main reasons. First, it is in the interest of the inventor to claim as much as possible; if you are doing synthesis, it is wise to follow only the examples given and not the claims. Second, because it is up to the assignee to protect the patent from infringement, some important pieces of information are withheld, and the language (“legalese”) is very difficult to follow.

Personal Communications - "The Invisible College"

  • While, technically speaking, personal communications between researchers may not be publications themselves, they are frequently cited in journal articles and elsewhere.
  • Networking between scientists in a given field can be extremely important. This peer-to-peer network is sometimes referred to as the "invisible college" - the worldwide college without walls that joins researchers in related fields.
  • Being active in scholarly societies (e.g. CSC, ACS) and communicating with your colleagues is vital to stay on top of your field!

 Searching on Computer Interfaces

  • Phrase searching - Many interfaces, including Google, use quotation marks (" ") to search for an exact phrase.
  • Truncation - Most online catalogs and databases allow some kind of truncation, that is, replacing part of a word with a symbol to search for multiple words with the same root. For example, organo? might search for organochlorine, organohalogen, organometallic, etc. Some systems allow you to truncate single characters, some allow you to truncate internally, e.g. wom!n. There is little consistency as to which characters are used for truncation: * # ? ! $ are all used in various systems for various types of truncation. Usually there is a “Help” button where the rules are explained.
  • Boolean searching - Generally speaking, most systems use the operators of Boolean algebra: OR, meaning either "term A" or "term B"; AND meaning both "term A" and "term B" must be present; and NOT meaning "term A" is present, but records with "term B" are excluded. However, not all systems are identical. Be aware of the usage on the system in question.
  • Proximity - If you enter multiple terms in a search window, some systems treat them as separate terms, some search them as phrases. Some allow you to specify the relationship of terms with proximity operators. Example: "term A" NEAR5 "term B" meaning that in a record the two terms have to be within five words of each other.
  • Stopwords - Usually words that are very common and lack subject meaning are not indexed, such as "a", "an", "the", prepositions, etc. In library catalogs, sometimes "a" "an" or "the" at the beginning of titles are omitted.  
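
To make these ideas concrete, here is a minimal Python sketch of how truncation, Boolean operators, proximity and stopwords behave on a tiny set of records. It does not describe how any particular database is implemented. The record titles and the stopword list are invented, and I have assumed "?" for any number of characters and "!" for exactly one character, following the examples above; real systems use different symbols.

```python
import re

STOPWORDS = {"a", "an", "the", "of", "in", "for"}   # assumed, not from any real system


def tokenize(text: str) -> list:
    """Lower-case the text, split it into words and drop the stopwords."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def truncation_to_regex(term: str):
    """'?' = any number of characters, '!' = exactly one character."""
    parts = []
    for ch in term.lower():
        if ch == "?":
            parts.append(r"[a-z0-9]*")
        elif ch == "!":
            parts.append(r"[a-z0-9]")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))


def search(term: str, records: dict) -> set:
    """Return the ids of the records containing a word that matches the term."""
    pattern = truncation_to_regex(term)
    return {rid for rid, text in records.items()
            if any(pattern.fullmatch(tok) for tok in tokenize(text))}


def near(term_a: str, term_b: str, text: str, distance: int) -> bool:
    """True if the two terms occur within `distance` indexed words of each other."""
    tokens = tokenize(text)   # stopwords are not indexed, so they are not counted
    pat_a, pat_b = truncation_to_regex(term_a), truncation_to_regex(term_b)
    pos_a = [i for i, t in enumerate(tokens) if pat_a.fullmatch(t)]
    pos_b = [i for i, t in enumerate(tokens) if pat_b.fullmatch(t)]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)


RECORDS = {
    1: "Synthesis of organometallic catalysts for the polymerization of olefins",
    2: "Organochlorine pesticides in women of reproductive age",
    3: "A woman chemist and the history of polymer science",
}

organo = search("organo?", RECORDS)   # records 1 and 2
women = search("wom!n", RECORDS)      # records 2 and 3

print(organo & women)   # AND -> {2}
print(organo | women)   # OR  -> {1, 2, 3}
print(organo - women)   # NOT -> {1}
print(near("organo?", "polymer?", RECORDS[1], 5))   # True
```

Because each search returns a set of record numbers, the Boolean operators AND, OR and NOT correspond directly to set intersection, union and difference.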

 

Current Contents® / Physical, Chemical & Earth Sciences

Current Contents / Physical, Chemical & Earth Sciences provides access to complete weekly bibliographic information from articles, editorials, meeting abstracts, commentaries, and all other significant items in recently published editions of over 1,050 of the world's leading physical, chemical and earth sciences journals and books in a broad range of categories. Only the titles are given for journal articles, in a mixture of English, French and German.

PubMed - the Medical Literature Index

Scholarly journals in medicine and related areas of science and engineering.

  • Comprehensiveness
    • Journals only - around 4,300 journal titles, mostly English language. PubMed adds close to 600,000 new records per year.
    • PubMed is very comprehensive for medical journals.
  • Chronological coverage
    • Mid-1960's to present.
    • Articles are indexed about two weeks to one month after publication date. In-process records can be as little as a week old.
  • Access points
    • Searchable by keyword.
    • Searches may be limited to specific fields or by date, language, or publication type (e.g., reviews).
    • PubMed uses MeSH (Medical Subject Headings), created by the National Library of Medicine, for subject indexing. The MeSH headings are very detailed for medical topics, and use extensive subheadings for even more specificity.
    • Specialized limits include: age ranges, human vs. non-human animal, male vs. female.
    • NOTE: SciFinder Scholar (CAS – see next major heading) searches both PubMed and CAS at the same time and removes duplicates.
  • Search features
    • Truncation - * is used for any number of characters at the end of a word. Note that PubMed will automatically map search terms to its thesaurus of MeSH headings, but use of truncation deactivates this feature.
    • Boolean operators - AND, OR, NOT available. Parentheses may be used for grouping terms.
    • Proximity - No proximity searching is available, but PubMed does check for certain phrases.
    • Stopwords - automatically ignored.
    • Combining searches - Previous searches can be combined from the search history; click on the "History" link to view searches and combine them.
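
As a rough illustration of how these search features combine, the Python sketch below assembles a PubMed query string from Boolean operators, parentheses and a publication-type limit, and then builds a URL for NCBI's E-utilities ESearch service. The helper names pubmed_query and esearch_url and the example search terms are my own assumptions; the code only builds the URL and does not contact PubMed.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint for Entrez databases such as PubMed.
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_query(all_terms, any_terms=(), exclude=(), reviews_only=False) -> str:
    """Combine terms with AND, a parenthesized OR group, and NOT."""
    parts = list(all_terms)
    if any_terms:
        parts.append("(" + " OR ".join(any_terms) + ")")
    query = " AND ".join(parts)
    if reviews_only:
        query += " AND review[pt]"      # [pt] limits by publication type
    for term in exclude:
        query += " NOT " + term
    return query


def esearch_url(query: str, retmax: int = 20) -> str:
    """Build (but do not send) an ESearch request for the PubMed database."""
    return ESEARCH + "?" + urlencode({"db": "pubmed", "term": query, "retmax": retmax})


query = pubmed_query(
    ["breast cancer", "gene silencing"],
    any_terms=["siRNA", "shRNA"],
    exclude=["plants"],
    reviews_only=True,
)
print(query)
# breast cancer AND gene silencing AND (siRNA OR shRNA) AND review[pt] NOT plants
print(esearch_url(query))
```

The same query string can simply be pasted into the PubMed search box. Adding * to a term would turn on truncation but, as noted above, would also switch off the automatic mapping to MeSH headings for that term.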

General comments: Best starting point for medical research. Note that some other versions of Medline take better advantage of the detailed Medical Subject Heading indexing available in the database. However, PubMed's free (i.e. taxpayer-supported) and public access has made it extremely popular.

 

Chemical Abstracts Service; http://www.cas.org

  • Chemical Abstracts Service was founded in 1907 as a division of the American Chemical Society.
  • Over 24 million documents total have been indexed (as of January, 2005.)
  • Acts as a repository of chemical information, a consequence of its excellent indexes.

What CAS Does

Importance of Chemical Abstracts

o        CA attempts to cover chemistry in the broad sense...anything that might be interpreted as new research in chemistry or chemical engineering

    • Chemistry as the "central science". CA's coverage has high overlap with medicine, biology, physics, materials, agriculture, geology, etc., making it important for researchers in those fields as well.
    • Note: since CA focuses on "new research", in earlier times it did not index all chemical patents - only those deemed to have "new chemistry".
  • Comprehensiveness
    • CA attempts to cover the literature of chemistry worldwide, in any language.
    • It attempts to cover all forms of primary chemical literature.
    • Note that in some cases - technical reports and dissertations - it depends on secondary sources and indexers do not read the original documents.
  • Chronological coverage
    • Print CA began in 1907; electronic CA in 1967 - but now the whole CA collection back to 1907 has been digitized, and CAS has added to the electronic database selected records from 1900 to 1906. (http://www.cas.org/New1/scientific_century.html)
    • Abstracts are added to the print sections every week; in online form, updates are daily. Basic bibliographic information for the 1500 core journals and key patent authorities is online the day after receipt at CAS. Other types of documents, especially technical reports and dissertations, may have a significantly greater time lag.
    • Print abstracts get keyword indexing when published and detailed indexing when a volume is completed; indexes are cumulated every ten volumes. Online records are first added with bibliographic data and abstracts only; detailed indexing is added as it is completed.
  • Access points
    • Weekly issues are indexed by author, keyword and patent number. Volume indices are indexed by author, subject heading, systematic chemical name, molecular formula and patent number.
    • Electronic forms combine keyword and subject heading approaches
    • In the online form, links to Registry File add enhanced searching of chemical substances, including structure searching.

Arrangement of Abstracts in Print Chemical Abstracts

  • For ease of browsing, abstracts are grouped by subject area.
  • Currently there are 80 subject sections (see http://www.cas.org/PRINTED/sects.html), divided into five broad groups.
  • Abstracts are added in all sections each week.
  • Cross-references are used where a given abstract might legitimately appear in more than one section.
  • Note that subject sections change with time to reflect current research.
  • Subject Coverage Manual gives a detailed definition of each section and a table of changes over the years.
  • One volume per year was published until 1962, when CA switched to two volumes per year. Collective Indexes were issued every ten years until 1957 and every five years since then.
  • Abstracts have been individually numbered only since 1967. From 1907-1932, pages were numbered and indices would refer to a page number, with a superscript denoting the order of the abstract on the page. Example: 321⁶ (321 with a superscript 6), for the sixth abstract on page 321.
    From 1933-1966, each page had two numbered columns of abstracts, with letters running down the center of the page to identify where in the column the abstract fell. Example: 1733h would be near the bottom of column 1733.
    Since 1967, abstract numbers have been of the form 223717w, where the final letter carries no meaning except as a sort of check digit.
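
Because the form of an index reference depends on the year of the volume, it can help to think of it as a small lookup. The sketch below is only an illustration of the three styles described above: the function name parse_ca_reference is hypothetical, the superscript is typed as "^" (so the first example is written 321^6), and the patterns are my own assumption about how the references look in plain text.

```python
import re


def parse_ca_reference(year, ref):
    """Interpret a print CA index reference according to the year of the volume."""
    if year < 1907:
        raise ValueError("print CA began in 1907")
    if year <= 1932:
        # Page number plus a superscript giving the order of the abstract on the page.
        m = re.fullmatch(r"(\d+)\^(\d+)", ref)
        if m:
            return {"style": "page and superscript",
                    "page": int(m.group(1)),
                    "order_on_page": int(m.group(2))}
    elif year <= 1966:
        # Printed number plus a letter showing how far down the abstract falls.
        m = re.fullmatch(r"(\d+)([a-z])", ref)
        if m:
            return {"style": "number and position letter",
                    "number": int(m.group(1)),
                    "position": m.group(2)}
    else:
        # Individually numbered abstracts; the trailing letter is only a check character.
        m = re.fullmatch(r"(\d+)([a-z])", ref)
        if m:
            return {"style": "abstract number",
                    "abstract": int(m.group(1)),
                    "check_letter": m.group(2)}
    raise ValueError(f"{ref!r} does not look like a {year} reference")


print(parse_ca_reference(1920, "321^6"))
print(parse_ca_reference(1950, "1733h"))
print(parse_ca_reference(1985, "223717w"))
```

Note that 1733h and 223717w have the same shape; only the year tells you whether the letter is a position or a check character.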

Contents of the Abstract Record

  • All CA records contain:
    • Title of the document
    • Author(s) or inventor(s) for patents
    • Corporate source or patent assignee information
    • Source Information, e.g. journal title, volume, issue, pages or patent numbers
    • Language
    • Abstracts (usually)
  • Authors' names appear as given in the original document.
  • Abstracts for journal articles are usually those written by the author.
  • Patent abstracts may be fleshed out by the indexer.
  • Dissertations and some other documents have no abstracts.
  • Note that in the early days of CA, the abstracts tended to be much longer and more detailed; nowadays, the abstracts are usually the same as those in the published paper.

Abbreviations

  • Journal names are listed using CASSI abbreviations.
  • Corporate names are heavily abbreviated.
  • All abstracts use abbreviations for common chemical terms (see CAS Standard Abbreviations and Acronyms at http://www.cas.org/ONLINE/standards.html).

Indexing in Print CA

  • For each volume there is an index of subjects, authors, formulas and patent numbers. However, the indexes to each volume become essentially superseded as collective indexes are issued.
  • The types of indexing available in CA reflect the constraints of print.
  • The indexing available in the weekly issues is that which can be done most quickly.
  • The indexing in the Volume and Collective Indexes is more systematic, but still reflects the limitations of print.
  • Issue Indexes
    • Author
    • Keyword
    • Patent
  • Volume & Collective Indexes
    • Author
    • Chemical Substance
    • General Subject
    • Molecular Formula
    • Patent

Author Indexing

  • Weekly Issues
    • All authors are listed by last name and initials only. The index gives only the abstract number. Examples:
      • Lipshutz B H 151869t
      • Little R D 152780u

Patents have entries for both inventor and assignee; their abstract numbers have P before the number.

  • Volume and Collective Indices
    • First authors get both the abstract number and title of the paper listed under their names.
    • The author name is not necessarily the form used in the article, but may be a standardized form of the name. (Note: in recent years, CAS has largely given up on name standardization and uses the form found in the document.)
    • Other authors are cross-referenced to the first author of the document. 
    • Even though CA tries to pull all of an author's works under one name, it cannot always distinguish authors with the same initials, so it alphabetizes by last name and initials, even where the full name is spelled out! Examples:
      • Ellis, A.
      • Ellis, Arthur Baron
      • Ellis, A. D.
      • Ellis, Anthony Ewart
      • Ellis, Avery K.
      • Ellis, Andrew Michael
      • Ellis, Albert T.
  • Spelling of Author Names: Be aware of special rules for handling certain names. Names with "Mc" or umlauted letters or transliteration from non-Roman alphabets can be tricky. Example:
    • Mössbauer is listed as Moessbauer 
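
The filing order in the Ellis example can be reproduced with a simple sort key. This is a minimal sketch, not CAS's actual procedure: the function names initials_key and index_spelling are hypothetical, and the umlaut table extends the single Mössbauer example to ä and ü as an assumption.

```python
import unicodedata


def initials_key(entry):
    """Sort key: last name, then the initials of the given names only."""
    last, _, given = entry.partition(",")
    initials = "".join(word[0] for word in given.split())
    return (last.strip().lower(), initials.lower())


# Umlauted letters are spelled out with a following "e" (assumed for all three vowels).
UMLAUTS = str.maketrans({
    "\u00e4": "ae", "\u00f6": "oe", "\u00fc": "ue",   # a-, o-, u-umlaut
    "\u00c4": "Ae", "\u00d6": "Oe", "\u00dc": "Ue",   # capital forms
})


def index_spelling(name):
    """Return the spelling to look for in the index."""
    return unicodedata.normalize("NFC", name).translate(UMLAUTS)


authors = [
    "Ellis, Albert T.", "Ellis, Andrew Michael", "Ellis, A.",
    "Ellis, Avery K.", "Ellis, Arthur Baron", "Ellis, Anthony Ewart",
    "Ellis, A. D.",
]
for name in sorted(authors, key=initials_key):
    print(name)

print(index_spelling("M\u00f6ssbauer"))  # Moessbauer
```

The sort key explains why "Arthur Baron" (A. B.) files ahead of "A. D.": the spelled-out given names are ignored and only the initials count.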

The General Subject Index includes:

    • classes of chemical substances
    • physical and chemical phenomena
    • types of reactions
    • chemical technology
    • industrial processes and equipment
    • scientific names for living organisms
    • biological and medical terminology  

CA Index Guide

  • The Index Guide is the key printed tool for identifying the correct subject heading for any topic in Chemical Abstracts
  • Each Index Guide lists the approved headings in use for its period of coverage.
  • An IG is published at the beginning of each Collective Index period, with updates every 18 months until the final comes with the Collective Index itself.
  • Contents of the Index Guide
    • An alphabetical listing of the approved subject headings, with cross-references to related headings and descriptive notes.
    • Many common terms not used as headings are listed, with See references to the correct heading.
    • Many common and/or trade names for chemical substances are listed, giving the correct CA systematic name (and Registry Number!)
    • There are also appendices on the organization and use of the subject indexes; how CA indexers select headings; CA chemical nomenclature; and a hierarchical list of the headings.
  • Whenever you are doing a subject search, in print or online, it's a good idea to check the Index Guide! And be sure to check the correct Index Guide for the years you are searching!
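
To pick the Index Guide that matches the years you are searching, you first need to know which Collective Index period a year falls in. The sketch below is a rough aid only: the function name collective_index_period is hypothetical, and it assumes that the ten-year periods are counted from 1907 and that the five-year periods begin with 1957. It does not check whether a printed Collective Index actually exists for a given recent year.

```python
def collective_index_period(year):
    """Return (first_year, last_year) of the Collective Index period containing year."""
    if year < 1907:
        raise ValueError("print CA began in 1907")
    if year <= 1956:
        # Ten-year periods: 1907-1916, 1917-1926, ...
        start = 1907 + 10 * ((year - 1907) // 10)
        return (start, start + 9)
    # Five-year periods: 1957-1961, 1962-1966, ...
    start = 1957 + 5 * ((year - 1957) // 5)
    return (start, start + 4)


print(collective_index_period(1950))  # (1947, 1956)
print(collective_index_period(1974))  # (1972, 1976)
```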

Substance Indexing: The Challenge of Nomenclature

  • In order to ensure that each substance has only one possible name, and to group "like" compounds together, CA has devised its own system of nomenclature (not necessarily IUPAC) and a scheme for arranging the names in the Chemical Substance Index.
  • Unfortunately, this system can be hideously complex. Here's a hideous example:
    • Dodecahedrane (C20H20) used to be listed as simply dodecahedrane.
    • Then a systematic name was assigned:
      5,2,1,6,3,4-[2,3]Butanylidenedipentaleno[2,1,6-cde:2',1',6'-gha]pentalene, hexadecahydro-
    • Now it's treated as a member of the fullerene family:
      [5]Fullerane-C20-Ih
  • It is important to remember that the CAS nomenclature has changed over time, as in the case of dodecahedrane above. The most important change took place in 1972; nomenclature has been fairly stable since then. But if you are using the older literature, you may have to do some checking to be sure of the correct terminology.

Basic Rules of CAS Nomenclature

  • CAS indexers select the "main" part of the compound to act as the heading parent.
  • Substituents on the parent are listed after it. This is referred to as inverted order.
  • What constitutes a parent compound and how it would be named are not always obvious, even to a chemist.
  • Examples
    • Toluene is
      Benzene, methyl-
    • ortho-Xylene is
      Benzene, 1,2-dimethyl-
    • Benzyl alcohol is
      Benzenemethanol
  • When there are multiple substituents, they are listed in alphabetical order of the substituent names, ignoring multiplying prefixes such as di- and tri-.
    • Carbon tetrachloride is
      Methane, tetrachloro-
    • CCl2F2 is
      Methane, dichlorodifluoro-
    • CCl3F is
      Methane, trichlorofluoro-
  • Polymers are listed by the monomer(s) or repeating unit, with polymer or homopolymer appended.
    • Teflon is
      Ethene, tetrafluoro-, homopolymer

Alphabetization of Compounds

  • Compounds are listed first by parent compound, with the parent compound itself first (with any qualifiers and categories), then by substituted forms in alphabetical order.
  • Substituents are read from left to right, ignoring numbers and punctuation.
  • Example: Benzene
    • Benzene
    • Benzene, analysis
    • Benzene, uses and miscellaneous
    • Benzene, compounds
    • Benzene, polymers
    • Benzene, azido-
    • Benzene, chloro-
    • Benzene, 1,2-dibutyl-
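
The "ignore numbers and punctuation" rule for the substituted forms can be written as a sort key. This is a minimal sketch with a hypothetical function name, substituent_key; it handles only the substituted entries (those ending in a hyphen), not the qualifier and category entries such as "Benzene, analysis", which are filed ahead of them.

```python
import re


def substituent_key(index_name):
    """Sort key: parent name, then the substituent text with only its letters kept."""
    parent, _, substituents = index_name.partition(",")
    letters_only = re.sub(r"[^A-Za-z]", "", substituents).lower()
    return (parent.strip().lower(), letters_only)


names = ["Benzene, 1,2-dibutyl-", "Benzene, chloro-", "Benzene, azido-"]
for name in sorted(names, key=substituent_key):
    print(name)
# Benzene, azido-
# Benzene, chloro-
# Benzene, 1,2-dibutyl-
```

The locants "1,2-" are stripped before comparison, so the dibutyl compound files under "d", after azido- and chloro-.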

Special Cases: Salts

  • Salts of organic acids or of inorganic oxyacids are named as derivatives of the parent acid; simple salts such as halides keep their ordinary names.
  • Potassium chloride is
    Potassium chloride
  • But: Potassium sulfate is
    Sulfuric acid, potassium salt (2:1)

Helps for finding CAS Chemical Names

  • In general, it can be very tricky to look at the structure of a complex compound and decide what the CA name will be.
  • Remember that some data collections give the CAS name for compounds: Merck Index, CRC Handbook of Chemistry and Physics, among others.

Using the Index Guide for Chemical Names

  • If the compound has a common or trade name, check the Index Guide.
  • The Index Guide is especially good for drugs, natural products, dyes, etc.
  • For other common chemicals, even if you can't find the specific chemical you want, you may be able to find a similar one and get a clue to follow.

Using the Registry Number Handbook for Chemical Names

  • Searching by compound Registry Number is a preferred approach when using any of the electronic forms of CA. However, there is no Registry Number index for printed CA.
  • CAS publishes a "handbook" which lists Registry Numbers and gives the CAS systematic name for the substance.
  • Remember that there are many sources you can use to find Registry Numbers which have good synonym indexes: Merck, HODOC, Combined Chemical Dictionaries (or the print equivalents), the Aldrich catalog, ChemFinder, Kirk-Othmer, etc.
  • On the other hand, you should also remember that different sources may give different Registry Numbers for what appears to be the same substance. Examples: parent compounds versus their salts, stereoisomers, polymers.

Molecular Formula Index

  • While most molecular formulae correspond to a large number of possible compounds, it is far easier to look at a possible name and decide whether it matches your compound than to guess at a name.
  • Note that the Molecular Formula Index just gives a list of abstract numbers, not a breakdown by subheadings.

Molecular Formula Index Organization

  • Molecular formulas are listed in Hill order:
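    1. If carbon is present, it comes first, followed by hydrogen, then all other elements in alphabetical order.
    2. If carbon is absent, all elements, including hydrogen, are listed in alphabetical order (hence ClH below).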
  • Note that the rules for salts apply to molecular formulas, too.
  • Molecular Formula Examples
    • Benzene is C6H6
    • Teflon is (C2F4)x
    • Ferrocene is C10H10Fe
    • Hydrochloric acid is ClH
    • Benzoic acid is C7H6O2
    • Sodium benzoate is C7H6O2, sodium salt...NOT C7H6NaO2
    • Deuterium and tritium are represented by D and T.
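
Hill order is easy to express as a short function. The following is a minimal Python sketch with a hypothetical function name, hill_formula; it takes element counts as a dictionary and applies the standard Hill convention (carbon first, then hydrogen, then the rest alphabetically; if there is no carbon, everything alphabetically, hydrogen included). It does not handle the salt rule, polymers, or D and T.

```python
def hill_formula(counts):
    """Format element counts, e.g. {"C": 6, "H": 6}, as a Hill-order formula."""
    if "C" in counts:
        # Carbon first, then hydrogen (if present), then the others alphabetically.
        order = ["C"] + (["H"] if "H" in counts else [])
        order += sorted(el for el in counts if el not in ("C", "H"))
    else:
        # No carbon: strictly alphabetical, hydrogen included.
        order = sorted(counts)
    return "".join(el + (str(counts[el]) if counts[el] != 1 else "") for el in order)


print(hill_formula({"H": 6, "C": 6}))              # C6H6
print(hill_formula({"Fe": 1, "H": 10, "C": 10}))   # C10H10Fe
print(hill_formula({"H": 1, "Cl": 1}))             # ClH
print(hill_formula({"O": 2, "H": 6, "C": 7}))      # C7H6O2
```

For a salt such as sodium benzoate, you would look up the formula of the parent acid (C7H6O2) and then scan the entries for the sodium salt.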

Using the Ring System Handbook for Chemical Names

  • Most compounds with a polycyclic ring system use the name of the ring system as the parent compound.
  • The Handbook lists ring systems in order of:
    • Increasing number of rings
    • Increasing number of atoms in the ring
    • Increasing Hill order formula of the ring
    • Example:
      Sample Ring System
    • Step 1: Count number of rings, using the smallest rings in the structure which will take in all the atoms in the ring system - 5
    • Step 2: Count the number of atoms in each ring - 5, 6, 6, 5, 5
    • Step 3: Note the "molecular formula" of each ring - C5, C6, C5O, C4O, C5
    • Step 4: Arrange the formulas in order of increasing size and Hill order: C4O, C5, C5, C5O, C6
    • Step 5: Look up the ring systems that fit the formula and pick the correct one by inspection (not always easy).
    • Resulting name: 5H-4a,11a-epoxy-7,9a-methano-1H-cyclopenta[b]heptalene
  • The entry for a given ring system gives structure diagram with CAS locant numbers, name, Registry Number of the parent ring.
  • Rings which are less unsaturated will (for complex rings) have the same parent name, but with, for example, "decahydro" added.
  • Exercise: try looking up a ring system with the ring formulas C5N-C6-C6.
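
Step 4 of the lookup procedure (arranging the ring formulas) can be sketched in a few lines of Python. The function name ring_key is hypothetical, each ring formula is assumed to be already written in Hill order, and the tie-break between rings of equal size (comparing the formulas element by element) is my own assumption; it does reproduce the ordering in the worked example.

```python
import re


def ring_key(formula):
    """Sort key for one ring: ring size first, then its formula element by element."""
    parts = [(element, int(count) if count else 1)
             for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula)]
    size = sum(count for _, count in parts)
    return (size, parts)


rings = ["C5", "C6", "C5O", "C4O", "C5"]   # Step 3 of the worked example
print(sorted(rings, key=ring_key))
# ['C4O', 'C5', 'C5', 'C5O', 'C6']
```

The sorted list gives the ring formula to look for in the Handbook; choosing the right ring system among those listed under it (Step 5) still has to be done by inspection.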

NOTE: Modern CAS searching is done through the SciFinder Scholar interface, which is built on the printed CA described above. SciFinder now includes chemical structure diagrams, which are especially helpful when dealing with isomers.

Chemistry on the Web

Google Scholar (http://scholar.google.com/)

  • Google Scholar is a new initiative by Google to make scholarly information (journal articles, books, dissertations, preprints, etc.) available through the familiar Google interface.

Organic Chemistry Portal;  http://www.organic-chemistry.org/

The PubChem project; http://pubchem.ncbi.nlm.nih.gov/

PubMed;  http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PMC

BioMed Central; http://www.biomedcentral.com/browse/journals/

Public Library of Science; http://www.plos.org/

 

Mount Allison E-Journal Collection:

Go to http://www.mta.ca/library/find_articles.html .  From the alphabetic bar the following sources are most useful for chemists:

  • American Chemical Society Fulltext titles
  • CISTI source
  • LINK by Springer Verlag and Associated Publishers
  • Oxford Journals Online
  • Royal Society of Chemistry
  • Science Citation Index (Web of Knowledge)
  • Science Direct
  • Springer Verlag and Associated Publishers (LINK)

These sources allow full online access to journal articles at Mount Allison.

 

Reference for chemical literature searching:

http://www.library.ucsb.edu/classes/chem184/

http://www.library.yale.edu/science/help/propk.html#sol