Clinical care guidelines recommend that newly diagnosed prostate cancer patients at high risk for metastatic spread receive a bone scan prior to treatment and that low-risk patients not receive one. The objective was to develop an automated pipeline that interrogates heterogeneous data to evaluate the use of bone scans using two different natural language processing (NLP) approaches.
Breast cancer is the most common cancer and the second leading cause of cancer-related death among women in the U.S. Known breast cancer risk factors include age, race/ethnicity, reproductive factors, and benign breast disease. Family history of breast cancer and hereditary cancer syndromes, such as BRCA1/BRCA2 mutations, confer the strongest risk for this disease.
Carotid artery atherosclerosis disease (CAAD) is measured in cases and controls using both structured data, including ICD diagnosis codes, and quantitative measurements of carotid stenosis based on Doppler and other imaging technologies.
The phenotype algorithm includes typical eMERGE pseudo code for implementing the structured data components of the algorithm, as well as a portable natural language processing (NLP) system used to extract percent stenosis measurements from imaging reports.
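As an illustration of the kind of extraction described above, the following is a minimal sketch that pulls percent stenosis values from free-text imaging reports with a regular expression. It is not the eMERGE NLP system itself; the function name `extract_percent_stenosis`, the pattern, and the sentence-splitting rule are all assumptions made for this example. Qualifiers such as "less than" and laterality (left/right) are not handled.

```python
import re

# Hypothetical pattern: either a range ("50-69%", "50 to 69%") or a single value ("80%").
PERCENT_RE = re.compile(
    r"(?P<low>\d{1,3})\s*(?:-|to)\s*(?P<high>\d{1,3})\s*%|(?P<single>\d{1,3})\s*%",
    re.IGNORECASE,
)


def extract_percent_stenosis(report: str) -> list[tuple[int, int]]:
    """Return (low, high) percent pairs found in sentences that mention stenosis.

    A single value such as "80%" is returned as (80, 80).
    """
    results = []
    # Naive sentence split (assumption): break after '.' or ';' or at newlines.
    for sentence in re.split(r"(?<=[.;])\s+|\n+", report):
        if "stenosis" not in sentence.lower():
            continue  # ignore percentages unrelated to stenosis
        for match in PERCENT_RE.finditer(sentence):
            if match.group("single") is not None:
                low = high = int(match.group("single"))
            else:
                low, high = int(match.group("low")), int(match.group("high"))
            # Discard values that cannot be valid percentages or ranges.
            if 0 <= low <= high <= 100:
                results.append((low, high))
    return results
```

For example, calling `extract_percent_stenosis` on a report containing "Right ICA: 50-69% stenosis." would yield the pair `(50, 69)`. A production system would also need negation handling and report-section awareness, which this sketch omits.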
This algorithm is for community-associated MRSA (Methicillin-resistant Staphylococcus aureus, read more at http://en.wikipedia.org/wiki/MRSA). Note that this algorithm uses lab results and not ICD-9 codes, because ICD-9 codes are not specific enough for this algorithm and/or are not used consistently for this phenotype. Thus, we expect the actual number of cases to be higher than what the eMERGE RC (Record Counter) estimated, and, as we will be studying patients aged 0 to 89, we would like for all sites to participate, please...
Algorithm to select subjects with "normal" electrocardiograms. Subjects do not have heart disease, interfering medications, or abnormal electrolytes at the time of the normal ECG. Individuals may, however, develop abnormalities later in life.
Hypothetical timeline for a single patient:
Please refer to attached documents.
Chronic kidney disease (CKD) is defined as an abnormality of kidney structure or function present for longer than 3 months. CKD can occur as a result of heterogeneous disorders affecting the kidney. In the United States, an estimated 13.6% of adults have CKD. Notably, adjusted mortality rates are higher for patients with CKD than those without, and rates increase with CKD stage. The purpose of this algorithm is to enable accurate CKD diagnosis and staging based on EHR data.
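To make the staging idea concrete, here is a minimal sketch that maps an eGFR value to the commonly used KDIGO GFR categories (G1 to G5) and applies a simple chronicity check based on the "longer than 3 months" criterion. This is not the algorithm in the attached documents. The function names, the use of 90 days as a stand-in for 3 months, and the rule of comparing only the earliest and latest low eGFR dates are assumptions. Categories G1 and G2 indicate CKD only when other markers of kidney damage are present, which this sketch does not evaluate.

```python
from datetime import date


def gfr_category(egfr: float) -> str:
    """Map an eGFR value (mL/min/1.73 m^2) to a KDIGO GFR category."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def has_chronic_low_egfr(
    measurements: list[tuple[date, float]], min_days: int = 90
) -> bool:
    """True if eGFR < 60 was recorded on dates more than `min_days` apart.

    Assumption: only the earliest and latest low values are compared; values
    in between are not checked for recovery.
    """
    low_dates = sorted(day for day, egfr in measurements if egfr < 60)
    if len(low_dates) < 2:
        return False
    return (low_dates[-1] - low_dates[0]).days > min_days
```

A real EHR implementation would also have to decide which eGFR equation to use, how to treat inpatient values that may reflect acute kidney injury, and how to incorporate albuminuria.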
Note: The attached documents contain the full case definition and two different control definitions: one for controls with 2 years of follow-up, the other for controls with 1 year of follow-up. All available controls with 2 years of follow-up were used in Vanderbilt's study. The control population was supplemented by controls with only 1 year of follow-up, because at the time of the study many of the available controls had experienced their qualifying events fairly recently and 2 years had not yet passed for full follow-up.
Clostridium difficile, also known as "C. diff," is a species of bacteria that causes severe diarrhea and other intestinal disease when competing bacteria in the gut have been wiped out by antibiotics (see Wikipedia entry). In rare cases, a C. diff infection can progress to toxic megacolon, which can be life-threatening. In a very small percentage of the adult population, C. difficile bacteria naturally reside in the gut. Other people accidentally ingest spores of the bacteria while they are patients in a hospital or nursing home.
A phenotype defining patients with strong evidence of having been diagnosed with colorectal cancer (cases) and patients who clearly do not have such diagnoses (controls). This phenotype is being used for sequencing studies. The only NLP involved in this phenotype is a very simple string search applied to pathology reports.
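A simple string search of this kind might look like the sketch below. The search terms shown are placeholders chosen for illustration and are not the terms used by the phenotype; the function name `report_mentions` is likewise hypothetical. The search is case-insensitive and does not handle negation ("no evidence of ...") or misspellings.

```python
from collections.abc import Iterable

# Placeholder terms for illustration only; the phenotype's actual terms are not given here.
EXAMPLE_TERMS = ("adenocarcinoma", "carcinoma")


def report_mentions(report_text: str, terms: Iterable[str] = EXAMPLE_TERMS) -> bool:
    """Return True if any search term appears in the report, ignoring case."""
    text = report_text.lower()
    return any(term.lower() in text for term in terms)
```

In use, each pathology report for a patient would be passed through `report_mentions`, and a hit would count as one piece of evidence toward case status alongside the structured data.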