Chronic Kidney Disease

Chronic kidney disease (CKD) is defined as an abnormality of kidney structure or function present for longer than 3 months. CKD can occur as a result of heterogeneous disorders affecting the kidney. In the United States, an estimated 13.6% of adults have CKD. Notably, adjusted mortality rates are higher for patients with CKD than those without, and rates increase with CKD stage. The purpose of this algorithm is to enable accurate CKD diagnosis and staging based on EHR data. The information on CKD stage/severity can be used to select appropriate cases for inclusion in genetic and epidemiologic studies. This information can also be used to design specific EHR-based interventions or clinical decision support tools for the diagnosis and management of CKD.
Our algorithm generally follows the National Kidney Foundation’s (NKF) Kidney Disease Outcomes Quality Initiative (KDOQI) CKD staging recommendations, as well as the Kidney Disease: Improving Global Outcomes (KDIGO) 2012 Clinical Practice Guideline for the Evaluation and Management of CKD. Specifically, the NKF KDOQI guideline formalizes the definition and staging of CKD, while the KDIGO guideline provides an enhanced classification framework for CKD with a direct relationship to the prognosis and management of progression and complications of CKD. Overall, two measures of kidney disease severity are used to perform CKD staging: estimated glomerular filtration rate (G-staging) and albuminuria (A-staging).
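As an illustration of G-staging and A-staging, the following minimal sketch maps a single eGFR value and a single urine albumin-to-creatinine ratio to the KDIGO 2012 categories. It is not the algorithm's actual implementation: the function names are hypothetical, the inputs are assumed to be already computed (eGFR in mL/min/1.73 m^2, UACR in mg/g), and the sketch does not check the chronicity requirement (abnormality present for longer than 3 months) or the other markers of kidney damage needed before G1 or G2 counts as CKD.

```python
def g_stage(egfr: float) -> str:
    """Map eGFR (mL/min/1.73 m^2) to a KDIGO 2012 G category."""
    if egfr < 0:
        raise ValueError("eGFR cannot be negative")
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def a_stage(uacr_mg_per_g: float) -> str:
    """Map urine albumin-to-creatinine ratio (mg/g) to a KDIGO 2012 A category."""
    if uacr_mg_per_g < 0:
        raise ValueError("UACR cannot be negative")
    if uacr_mg_per_g < 30:
        return "A1"
    if uacr_mg_per_g <= 300:
        return "A2"
    return "A3"


def kdigo_category(egfr: float, uacr_mg_per_g: float) -> str:
    """Combine both axes into a single label such as 'G3aA2'."""
    return g_stage(egfr) + a_stage(uacr_mg_per_g)
```

For example, `kdigo_category(50, 350)` yields the label `G3aA3`. A real implementation would apply this to repeated measurements over time rather than to a single value.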

Phenotype ID: 
Type of Phenotype: 
Phenotype Attributes: 
Authors: Ning Sunny Shang, Sumit Mohan, Ali G. Gharavi, Chunhua Weng, Paul Drawz, George M. Hripcsak, Krzysztof Kiryluk
Contact Author: 
Date Created: Thursday, August 25, 2016
Network Associations: 
Owner Phenotyping Groups: 

Suggested Citation

Ning Sunny Shang, Sumit Mohan, Ali G. Gharavi, Chunhua Weng, Paul Drawz, George M. Hripcsak, Krzysztof Kiryluk. Columbia University, Mayo Clinic, VUMC. Chronic Kidney Disease. PheKB; 2016 Available from:



    Thanks, David. 

    The relevant files are updated to V4.1 for this change. 

    Best regards, 


    Hi Sunny,

    At Geisinger, our lab results for 14958-3 and 13801-6 are from random urine. A 24-hour urine collection is entered as many random results. Is random urine acceptable? Can I use the total of these results over a 24-hour period?



    Thanks Ken. 

    I think random urine is acceptable, as long as albumin/creatinine (UACR) and protein/creatinine (UPCR) are each measured from the same urine sample. The purpose here is to collect the different types of urine tests (UACR, UPCR, A24, P24, spot urine albumin/protein tests with the corresponding spot urine creatinine, and urinalysis) to estimate proteinuria. At our site, we also have tests from random urine, and the LOINC codes may not exactly match those used by local labs.

    Best regards, 


    Please check the file CKDalgorithm_V4.1_codingUpdatedForUrineLabs_20170830.txt for the updated LOINC codes for urine tests. Our data use non-LOINC codes for these labs but have mappings to LOINC, so the listed codes are based mainly on our data. Different institutions may have different LOINC mappings, so the list may not be complete. Please let me know if your institution uses different LOINC codes for the relevant urine tests, and we can update the list.
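The same-sample requirement for spot urine tests discussed above can be sketched as follows. This is an illustrative example only, not part of the distributed algorithm files: the record layout (`sample_id`, `value`) and the function names are hypothetical, and both albumin and creatinine are assumed to be reported in mg/dL. Local labs may report other units (for example mg/L for albumin), which would need conversion first.

```python
def uacr_mg_per_g(albumin_mg_dl: float, creatinine_mg_dl: float) -> float:
    """Urine albumin-to-creatinine ratio in mg/g from two spot values in mg/dL."""
    if creatinine_mg_dl <= 0:
        raise ValueError("urine creatinine must be positive")
    # mg albumin per mg creatinine, scaled by 1000 to mg per g creatinine
    return albumin_mg_dl * 1000.0 / creatinine_mg_dl


def pair_uacr_by_sample(albumin_results, creatinine_results):
    """Compute UACR only where both results come from the same urine sample.

    Each result is a dict with hypothetical keys 'sample_id' and 'value'.
    Albumin results with no matching creatinine are skipped, not estimated.
    """
    creatinine_by_sample = {r["sample_id"]: r["value"] for r in creatinine_results}
    ratios = {}
    for r in albumin_results:
        creatinine = creatinine_by_sample.get(r["sample_id"])
        if creatinine is not None and creatinine > 0:
            ratios[r["sample_id"]] = uacr_mg_per_g(r["value"], creatinine)
    return ratios
```

Keying on a sample identifier, rather than on patient and date alone, avoids pairing albumin from one random collection with creatinine from another on the same day.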



    We just noticed from the "demo..." data dictionary tab that two of the variables requested (TYPE_2_DIABETES, and HYPERTENSION) require running separate, complete *other* eMERGE phenotypes, one for each variable.  Given that we have new subjects in eM3, this would require running a total of three phenotypes for CKD.  That is something we did not anticipate and is going to be a heavy lift for us.  Do you have alternative methods for populating these two comorbidity variables?


    We assumed each site had implemented the type 2 diabetes and hypertension phenotypes, since these are eMERGE-2 (eII) phenotypes. In the CKD phenotype, we reuse the previous phenotyping results. If this is not the case, we can share our implementations for these two variables. Please let us know if this would help.

    Thanks for your reply.  Your assumption that all eMERGE sites have implemented all eMERGE 1, 2, and 3 phenotypes for all of their eMERGE 1, 2, and 3 cohorts is not correct.  You may recall that at the beginning of eMERGE-3 decisions were made about which eMERGE-2 phenotypes would be re-computed for new eMERGE-3 subjects, and your two secondary phenotypes were not among them.  Is there still value to you in our site implementing your CKD algorithm on a limited set of our subjects (i.e., only subjects that were part of our eMERGE-1 and eMERGE-2 cohorts)?



    Thank you, David. We prefer that the CKD algorithm be applied to all subjects. If you don't have T2DM and hypertension results for new eMERGE-3 subjects, you can mark those values as missing. If you would like to re-run the algorithms on eMERGE-3 subjects, we do have SQL query scripts for T2DM and hypertension to share, but they will require some revision to fit your local database. Please let us know.

       The pediatric sites did not run these two algorithms - they must have been for adults only and/or in eMERGE-1.  If the algorithms do cover pediatrics, what dialects of SQL are available for the 3 algorithms CKD, T2DM and Hypertension?

    We are getting an error message (not data related) when attempting to upload our files through PheKB.  Do you prefer we provide them directly or should we wait until we can go through the website?

    Problem resolved.  We used the SQL Server script and SQL Server to generate the data files and the default encoding for exporting data is UTF-8. When we converted our files to ANSI the error on loading disappeared.
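For sites that hit the same upload error, one possible way to do the conversion described above is sketched below. It assumes that "ANSI" means the Windows-1252 code page (`cp1252`), which is common on Western-locale Windows systems but is not confirmed in this thread. The function name and file paths are hypothetical.

```python
from pathlib import Path


def convert_utf8_to_ansi(src: str, dst: str, ansi_codec: str = "cp1252") -> None:
    """Re-encode a UTF-8 text file (with or without BOM) into an ANSI code page.

    Works on bytes so that line endings are preserved exactly. Raises
    UnicodeEncodeError if the file contains a character that the target
    code page cannot represent, rather than silently replacing it.
    """
    text = Path(src).read_bytes().decode("utf-8-sig")  # 'utf-8-sig' drops a leading BOM
    Path(dst).write_bytes(text.encode(ansi_codec, errors="strict"))
```

A call such as `convert_utf8_to_ansi("ckd_output_utf8.csv", "ckd_output_ansi.csv")` would write the re-encoded copy and leave the original file untouched.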