

SAMJ: South African Medical Journal

On-line version ISSN 2078-5135
Print version ISSN 0256-9574

SAMJ, S. Afr. med. j. vol.106 n.2 Pretoria Feb. 2016 



Are central hospitals ready for National Health Insurance? ICD coding quality from an electronic patient discharge record for clinicians



R E Dyers (I, II); J Evans (III); G A Ward (IV, V); S du Plooy (VI); H Mahomed (VII, VIII)

(I) MB ChB, FCPHM (SA); Division of Community Health, Department of Interdisciplinary Health Sciences, Faculty of Medicine and Health Sciences, Stellenbosch University, Tygerberg, Cape Town, South Africa
(II) MB ChB, FCPHM (SA); Western Cape Government Department of Health, Cape Town, South Africa
(III) BSc (Med), PhD; Western Cape Government Department of Health, Cape Town, South Africa
(IV) MB ChB; Western Cape Government Department of Health, Cape Town, South Africa
(V) MB ChB; School of Public Health and Family Medicine, Faculty of Health Sciences, University of Cape Town, South Africa
(VI) Western Cape Government Department of Health, Cape Town, South Africa
(VII) MB ChB, MMed, PhD; Division of Community Health, Department of Interdisciplinary Health Sciences, Faculty of Medicine and Health Sciences, Stellenbosch University, Tygerberg, Cape Town, South Africa
(VIII) MB ChB, MMed, PhD; Western Cape Government Department of Health, Cape Town, South Africa





BACKGROUND: South Africa (SA)'s planned National Health Insurance reforms require the use of International Statistical Classification of Diseases (ICD) codes for hospitals to purchase services from the proposed National Health Authority. However, compliance with coding at public hospitals in the Western Cape Province has been challenging. A computer application was developed to aid clinicians in integrating ICD coding into the patient hospital discharge process.
OBJECTIVES: To evaluate the quality of ICD codes captured using the application and predictors thereof in a single hospital department.
METHODS: After 6 months, the quality of ICD codes was determined by comparing ICD code descriptors with medical concepts in a random sample of original patient records selected over a 6-week period. Patient and personnel characteristics influencing quality of coding, derived from a theoretical framework, were collected.
RESULTS: Of 223 patient records, 45.3% (95% confidence interval (CI) 38.8 - 51.9) had complete ICD codes. Primary ICD code accuracy was 74.0% (95% CI 67.8 - 79.5). Patient characteristics such as female gender, younger age group and fewer comorbidities, as well as seniority of clinician rank, were significantly associated with ICD coding being complete on adjusted analysis.
CONCLUSIONS: The results of this study describe ICD coding quality at a central hospital in SA supported by a computer application and the factors influencing this. More interventions are required to achieve reliable coding data, such as additional ICD coding validation tools, training and oversight of junior clinicians.



More than 100 countries, including South Africa (SA), use the 10th revision of the International Statistical Classification of Diseases and Related Health Problems of the World Health Organization (ICD-10) codes to promote international comparability in the collection, classification, processing and presentation of morbidity and mortality statistics. Healthcare facilities generally use ICD codes for determining the acuity and severity of cases as well as to assess quality of care. In SA, ICD codes are primarily used by the public sector for patient billing and morbidity and mortality surveillance. Reliable hospital morbidity profiles are necessary to prioritise and fund appropriate public health interventions.

Despite policy requiring clinicians to use ICD codes for all patients in the public sector, the implementation of ICD coding has not been systematically monitored for coverage and quality. Unlike the private sector, which relies on dedicated coders, public sector clinicians are expected to code diagnoses of all inpatients themselves. There is evidence to support the need to improve ICD coding quality for mortality surveillance in SA.[1] However, little has been published on the quality of ICD coding for morbidity surveillance in SA. Purchasing by means of diagnosis-related groups (DRGs) is mentioned in the National Health Insurance (NHI) policy, for which ICD coding will be essential.[2] In a review of NHI pilot site performance 18 months after the start of the pilot, it was noted that the ICD system was not yet operating satisfactorily and would need to be strengthened.[2]

Support processes such as initial orientation and training, ongoing coder training programmes, coder accreditation, communication between health professionals, peer review and review by superiors are key to improving ICD coding quality.[3] It has also been recommended that appropriate assistive tools such as coding software and guidelines be made available to coders.[4] Although training programmes have been shown to have a positive impact on the quality of ICD codes assigned by professional coders,[5] clinicians find it difficult to commit to costly, time-consuming accredited ICD coding courses.

Concurrent implementation of all the abovementioned recommendations would be difficult within the current capacity of the Western Cape Government Department of Health. As a compromise, a process was initiated to improve the quality of ICD coding by commissioning a software application for discharge summaries, the electronic Continuity of Care Record (eCCR). The eCCR was designed to assist clinicians in preparing discharge summaries, with ICD code lookup functionality integrated into that process.



Objectives

The primary objective of this study was to evaluate the completeness and accuracy of ICD codes captured using the eCCR tool at a central hospital in the Western Cape Province of SA. The study also explored patient and clinician characteristics that may be associated with ICD coding accuracy and completeness.



Methods

Study design

This was a cross-sectional study in which the quality of eCCR data was assessed in relation to patient records.

Study setting and population

The study was conducted in the internal medicine wards of a central hospital in the Western Cape. Hard-copy patient records and the eCCR were reviewed for patients who were discharged over the 6-week period 22 July 2013 - 30 August 2013. During this period it was required that all patients admitted to general internal medicine wards receive discharge summaries prepared using the eCCR.

Sample size

A total of 477 patients were discharged from the general internal medicine department during the 6-week study period. Owing to the relatively small size of this population (<5 000), the normal approximation to the hypergeometric distribution was used to estimate the sample size. With an estimated expectation of 50% completeness and accuracy, a level of confidence of 95% and a significance level of 5%, a required sample size of 214 was obtained. This was inflated by 10 to address the possibility that original patient records might be missing. Two hundred and twenty-four records were randomly selected from the eCCR database.
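The sample size reported above can be reproduced with a short calculation. The sketch below is an illustration only: it assumes the standard formula for estimating a proportion with a finite population correction, and it reads the stated 5% as the margin of error (precision). The function name and parameters are hypothetical and are not taken from the study.

```python
import math


def sample_size_finite(population, p=0.5, z=1.96, margin=0.05):
    """Sample size for estimating a proportion in a small, finite population.

    Assumptions (not confirmed by the source): z = 1.96 for 95% confidence,
    and `margin` is the absolute margin of error.
    """
    # Sample size for an effectively infinite population
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    # Finite population correction
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


required = sample_size_finite(477)   # discharges in the 6-week period
requested = required + 10            # allowance for missing paper records
print(required, requested)
```

Under these assumptions the calculation yields the 214 records stated in the text, and adding the allowance of 10 gives the 224 records that were requested.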

Data collection

Data were collected from original patient records, the human resource management information system and the eCCR.

The ICD codes from the eCCR were checked against discharge summary narratives and original patient records by the principal investigator and an ICD coding expert. Similar to a method used by Chute et al.,[6] primary ICD codes for each patient record were reviewed and classified as:

  • Match. The primary diagnosis in the patient record was coded to the highest level of detail available in the ICD-10 Master Industry Table, SA version - June 2013 (SA MIT).
  • Partial match. The primary diagnosis in the patient record was within the scope of the medical concept of the chosen primary ICD code descriptor, but not to the highest level of detail available in the ICD-10 SA MIT.
  • No match. The primary diagnosis in the patient record was not within the scope of the medical concept of the chosen primary ICD code.
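The three categories above were assigned by human reviewers who compared medical concepts, not by software. Purely as an illustration of the logic, the sketch below approximates the comparison using the hierarchical structure of ICD-10 codes: a chosen code that is a less specific parent of the reference code is treated as a partial match. The function name, the prefix heuristic and the example codes are assumptions made for this sketch and are not part of the study method.

```python
def classify_primary_code(chosen, reference):
    """Classify a chosen ICD-10 code against a reviewer's reference code.

    Simplifying assumption: 'within the scope of the medical concept' is
    approximated by the chosen code being a prefix (parent category) of the
    most detailed reference code.
    """
    c = chosen.replace(".", "").strip().upper()
    r = reference.replace(".", "").strip().upper()
    if c == r:
        return "match"            # coded to the highest level of detail
    if len(c) >= 3 and r.startswith(c):
        return "partial match"    # right concept, insufficient detail
    return "no match"             # outside the scope of the diagnosis


# Illustrative codes only
print(classify_primary_code("E11.9", "E11.9"))
print(classify_primary_code("E11", "E11.9"))
print(classify_primary_code("I10", "E11.9"))
```

A prefix rule cannot capture clinically related concepts that sit in different ICD-10 chapters, which is one reason expert review was needed in practice.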

For this study, discharge summary clinical narratives in the original patient record were regarded as the source for the 'relevant clinical concepts' of the clinical encounter, based on the assumption that the clinician had diligently abstracted the most relevant clinical information of the patient episode. The eCCR discharge summary narratives were reviewed for any free-text relevant clinical concepts that should have been coded by the clinician as secondary diagnostic codes. The number of free-text clinical concepts that were not coded was recorded. The definitions of terms used in this report are set out in Table 1.



Data from the patient records and eCCR were recorded on predesigned data collection forms and then captured in a piloted, preformatted and locked Microsoft Excel 2011 spreadsheet by the principal investigator, and checked by an expert in ICD coding.

Inclusion criteria

  • Records of inpatients who were discharged using the eCCR in the internal medicine department at a central hospital between 22 July 2013 and 30 August 2013.

Exclusion criteria

  • Where there were records of repeat admissions within the study period, only the first admission was selected.
  • Records of patients who died in hospital before discharge were excluded.
  • Patient records where the original paper patient record could not be found after three requests on separate dates were excluded.
  • ICD-9 codes, used for coding procedures, were excluded.

Measurement tools

The ICD-10 (SA version, June 2013) was used as a reference for checking the accuracy and completeness of ICD codes. Instructional notes from the Centers for Disease Control, Atlanta, USA, were used to assist in appraising the ICD coding. These two resources were available to clinicians utilising the eCCR during the study period. Patient data were collected from folders and clinician characteristics from human resources records.

Statistical analysis

Primary ICD codes were regarded as accurate if they were classified as a 'match' as described above. ICD coding of a record was regarded as complete if all the relevant clinical concepts were represented by at least a 'partial match'.
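The two outcome definitions above can be expressed as simple record-level rules. This is a minimal sketch, assuming each relevant clinical concept in a record has already been given a reviewer label; the label "not coded" and the function names are hypothetical and introduced only for illustration.

```python
# Labels that count towards completeness, per the definition in the text
ACCEPTABLE_FOR_COMPLETENESS = {"match", "partial match"}


def primary_code_is_accurate(primary_label):
    """Accuracy: the primary ICD code must be a full 'match'."""
    return primary_label == "match"


def record_is_complete(concept_labels):
    """Completeness: every relevant clinical concept in the record is
    represented by at least a 'partial match'."""
    return len(concept_labels) > 0 and all(
        label in ACCEPTABLE_FOR_COMPLETENESS for label in concept_labels
    )


# Hypothetical record: primary diagnosis fully matched, one comorbidity
# partially matched, one free-text concept left uncoded.
labels = ["match", "partial match", "not coded"]
print(primary_code_is_accurate(labels[0]))  # accurate primary code
print(record_is_complete(labels))           # but an incomplete record
```

The example shows why a record can have an accurate primary code and still be incomplete: a single uncoded concept is enough to fail the completeness rule.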

Data were exported from Excel to Stata, version 13.1, for analysis, with p-values of <0.05 regarded as statistically significant. Means and 95% confidence intervals (CIs) and medians and interquartile ranges (IQRs) were calculated for continuous and count variables. Categorical variables were described with proportions and 95% CIs.

Multiple logistic regression was used to determine the associations between ICD coding completeness and patient as well as clinician characteristics. Similarly, associations between ICD coding accuracy and patient as well as clinician characteristics were explored. We reported crude and adjusted odds ratios (ORs) with 95% CIs and p-values. Based on the assumption that these were the factors most likely to modify ICD code quality, the adjusted regression model included patient age, gender, fee-paying status, comorbidity and the clinician's position. The number of summaries prepared per clinician varied, introducing a cluster design effect, which was adjusted for in the analysis.

Ethics approval

The study was approved by the Stellenbosch University Health Research Ethics Committee, and was conducted according to accepted and applicable national and international ethical guidelines and principles, including those of the Declaration of Helsinki, October 2008. Permission was obtained from the Provincial Health Research Committee to conduct the research and to access the routine data required. Patient identifiers were removed prior to analysis and reporting.



Results

Included records

Of the 224 electronic and paper records that were requested, one was excluded on the basis that the patient died before discharge. There were no missing folders, and 223 records were included in the analysis.

Patient and clinician characteristics

Descriptive characteristics of patients and clinicians are shown in Table 2. There was no statistically significant difference in the number of comorbid conditions between male and female patients (p=0.84). Patients aged >40 years accounted for about two-thirds of the sample, paying patients accounted for 62.8% of the sample, and interns accounted for just over half of the 33 clinicians.



ICD-10 coding completeness and accuracy

While 165 (74.0%) of the 223 patient discharge records' primary ICD codes were accurate (i.e. had a complete match), only 101 records (45.3%) had complete sets of codes for the admission. There were 192 patient discharge records (86.1%) with at least a partial match of the primary ICD code to the primary discharge diagnosis. Six hundred and sixty (75.4%) of the 875 diagnostic clinical concepts in all the discharge summaries were coded.
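The percentages above follow directly from the reported counts. The sketch below recomputes them and adds a Wilson score interval as one common way of attaching a 95% CI to a proportion. The choice of the Wilson method is an assumption made for illustration: the source does not state which interval method was used, and the published CIs may also reflect the cluster adjustment, so the bounds printed here need not equal those in the paper.

```python
import math


def percentage(k, n):
    """Proportion k/n expressed as a percentage, rounded to one decimal."""
    return round(100 * k / n, 1)


def wilson_interval(k, n, z=1.96):
    """Wilson score interval for a binomial proportion (illustrative choice)."""
    p = k / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


# Counts taken from the text
results = {
    "primary code accurate (match)": (165, 223),
    "complete set of codes": (101, 223),
    "primary code at least partial match": (192, 223),
    "clinical concepts coded": (660, 875),
}

for label, (k, n) in results.items():
    low, high = wilson_interval(k, n)
    print(f"{label}: {percentage(k, n)}% "
          f"(Wilson 95% CI {100 * low:.1f} - {100 * high:.1f})")
```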

Factors associated with ICD-10 coding completeness and accuracy

Patient characteristics of female gender, age <40 years and fewer comorbid conditions were significantly associated with the completeness of ICD coding in the bivariate and multivariate analyses. The records of female patients had twice the odds of having a complete set of diagnostic codes compared with those of males. The records of patients aged >40 years had about half the odds of having a complete set of codes compared with those of patients aged <40 years. The odds of the ICD codes being complete in the patient record decreased by 40% each time an additional comorbid condition required encoding.
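Because these effects are expressed as odds, they do not translate one-to-one into changes in probability. The sketch below is a worked illustration, assuming an odds ratio of roughly 0.6 per additional comorbid condition (read from the "decreased by 40%" statement) and a hypothetical baseline probability of completeness; the exact estimates are in Tables 3 and 4, which are not reproduced here.

```python
def odds_from_probability(p):
    return p / (1 - p)


def probability_from_odds(odds):
    return odds / (1 + odds)


def apply_odds_ratio(baseline_probability, odds_ratio, times=1):
    """Probability after multiplying the baseline odds by `odds_ratio`
    `times` times (e.g. once per additional comorbid condition)."""
    odds = odds_from_probability(baseline_probability) * odds_ratio ** times
    return probability_from_odds(odds)


BASELINE = 0.60            # hypothetical probability of a complete record
OR_PER_COMORBIDITY = 0.6   # approximation of "odds decreased by 40%"

for extra in range(4):
    p = apply_odds_ratio(BASELINE, OR_PER_COMORBIDITY, extra)
    print(f"{extra} additional comorbid condition(s): "
          f"probability of complete coding = {p:.3f}")
```

The point of the example is that a 40% drop in odds per condition compounds multiplicatively on the odds scale, so the fall in probability depends on the starting point and is not a fixed 40% each time.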

There was a significant increase in the odds of ICD coding completeness with an increase in clinician age in the bivariate analysis. However, this relationship was no longer significant in the adjusted analysis. Compared with interns, registrars were more likely to have produced a complete set of ICD codes for a patient record in both the crude and adjusted analyses. Registrars had nearly three times the odds of correctly coding the patient's primary diagnosis compared with interns in the bivariate analysis, but this was no longer significant in the adjusted multivariate analysis. Details of the associations and CIs are summarised in Tables 3 and 4.







Discussion

The results of this study describe the completeness and accuracy of discharge ICD coding using an electronic discharge summary application and the factors influencing these. About three-quarters of patients included in the analysis were assigned an accurate primary diagnosis code on discharge using the eCCR tool, with just under half receiving complete sets of discharge diagnosis codes. The 86.1% at least partial match of primary ICD codes in the complex setting of internal medicine suggests that the data from the eCCR may be acceptable for high-level description of morbidity of hospital patients. However, this may not be an acceptable standard for the purposes of revenue retrieval and compliance with financial prescripts. Having a quarter of patient records inaccurately coded may have negative consequences for DRG formulation and purchases by regional and central hospitals from the National Health Authority as proposed in the NHI policy.

Both patient and clinician factors were found to affect ICD-10 coding quality.

Increasing comorbidity had a negative association with the quality of ICD codes, possibly because the clinician was faced with the technical challenges of finding the correct ICD code for each additional clinical concept, while working in a time-constrained environment. A study of the acute hospitalisation needs of adults admitted to public facilities in Cape Town found that comorbid disease was present in 78.1% of all medical admissions.[7] This indicates the proportion of patients at increased risk of having their admission episodes incompletely coded, which makes the finding highly relevant.

The statistically significant relationship between ICD coding completeness and age >40 years is probably linked to age 40 being a recognised age of onset of comorbid diseases of lifestyle such as diabetes and hypertension.[8] This finding therefore reflects the higher number of a patient's comorbid conditions, rather than his/her increased age, as an independent predictor of coding incompleteness. The statistically significant finding that records of female patients were twice as likely to have a complete set of ICD codes as those of male patients may relate to clinicians experiencing difficulty in encoding medical conditions that were more prevalent among males than females in this study population. This requires further investigation into the morbidity profiles of the patients.

Although the statistically significant inverse relationship between primary ICD coding accuracy and the patients' paying status in the adjusted results may raise concerns about clinician-generated data quality for revenue retrieval, it may, on the other hand, justify targeted support for coding paying patients' medical records. Perhaps clinicians assumed that case managers would provide more comprehensive codes and that free-text narrative was adequate. The clinicians' experience in clinical practice, particularly in the skill of summarising inpatient episodes, may account for the statistically significant relationships of clinician age and clinician rank with ICD coding quality. It may be worthwhile for clinical departments to explore processes whereby more experienced clinicians check the ICD codes in the discharge summaries prepared by their junior colleagues.

The conflict between everyday clinical terminology and the descriptors of the ICD coding system may have contributed to inaccurate and incomplete coding.[9,10] Hohnloser and Purner[11] noted that imposing too many restrictions and non-editable lists, especially mandatory ICD coding, drove users away from their discharge summary application back to manual documentation. They therefore allowed substantial parts of the discharge summary to be entered as free text. Unlike Hohnloser's Patient Archiving and Documentation System (PADS), the eCCR did not allow users to print a discharge summary unless a primary ICD code and description had been selected from a non-editable list.[11] While Hohnloser et al.[12] noted that as many as 84% of relevant clinical concepts may be shifted to the free-text section of the discharge summary when clinicians are forced to code manually, only 24.6% of these were represented as free text in the eCCR. In our SA context, clinicians were more willing to accept some restrictions, the reasons for which require further exploration.

The quality of ICD coding in this study should be interpreted together with previous research findings that, because of the limitations in the design of the ICD system, it may not be possible ever to achieve perfect results.[9,13,14] Chute and co-workers[6,14] noted that none of the classification systems was able to capture all clinical concepts that were of interest to clinicians. However, there are key lessons about some of the enablers of, and barriers to, ICD coding in this study, particularly if these are viewed as part of a quality improvement cycle. There may be benefits to looking beyond ICD coding and rather seeking overall improvement in the entire discharge process with 'the use of checklists, alerts, and predictive tools; embedded clinical guidelines that promote standardized, evidence-based practices; electronic prescribing and test-ordering that reduces errors and redundancy; and discrete data fields that foster use of performance dashboards and compliance reports'.[15] All these benefits would not only free up clinicians' time to apply their minds to ICD coding, but also provide practical ways of managing competing demands during the discharge process.

Study limitations

The issue of temporality is an important limitation of this cross-sectional study. Causality between the independent and dependent variables cannot be assumed. Furthermore, the relationship between these variables may be confounded by a number of factors that were not investigated in this study.

The study did not include the ordering of ICD codes as a measure of ICD coding quality. The use of only one clinical discipline at only one hospital limits the generalisability of the results. However, given the complexity of a central hospital, it may be reasonable to assume that coding accuracy may be better in a less complex environment, recognising that there are likely to be differences in the clinical practice and administrative processes between different hospitals and between clinical disciplines.

The significant findings of associations between patient or clinician characteristics and ICD coding quality may reflect the fact that particular types of patients are assigned to particular types of clinicians, and these types of clinicians may have a tendency to get the ICD codes either right or wrong. The investigators sought to compensate for this with cluster analysis. Addressing this potential bias by randomising patient or clinician assignment and balancing the number of discharges per clinician would have created an artificial scenario that did not exist on the service delivery platform. The investigators used a more pragmatic approach to research this subject in order to retain the complexity of the actual healthcare delivery setting, thereby making the results more meaningful for translation into policy.

The study findings suggest that the use of a tool such as the eCCR has the potential to improve ICD coding quality, thereby aiding in the implementation of NHI policy in central hospitals. The eCCR was developed not only to help clinicians with ICD coding, but also to help them manage competing clinical processes in a comprehensive and structured manner. This study was, however, not designed to measure the impact of the eCCR as an intervention, or to gauge clinicians' experience with the tool. These aspects will be described in future publications where more appropriate study designs will have been used.



Conclusion

These cross-sectional study results describe the baseline of ICD coding quality in a central hospital setting in SA. More work is required to improve morbidity surveillance data to a standard that can inform public health policy. The integration of clinical concept coding into the discharge summary may aid clinicians in producing ICD codes of fair quality. Further experimental research on the eCCR or similar software should be considered in additional hospital settings, with a view to integrating it within the routine hospital information system. Additional ICD coding validation tools, training, oversight of junior clinicians and co-ordination of competing processes are also recommended.

Acknowledgments. We thank the following people in the Western Cape Government Department of Health for providing us with data for this research: Krish Vallabhjee, Ian de Vega, Lesley Shand, Wendy Bryant, Rashida Adam, Nadine Ross and Adam Loff. We also thank Peter Raubenheimer, Tracey Naledi, David Coetzee, Lilian Dudley, Stephan Fourie, Lyn Hanmer and Suzette Munro for providing technical support, and Justin Harvey, Roderick Machekano and Nesbert Zinyakatira for statistical support.



References

1. Nojilana B, Groenewald P, Bradshaw D, Reagon G. Quality of cause of death certification at an academic hospital in Cape Town, South Africa. S Afr Med J 2009;99(9):648-652.

2. Matsoso MP, Fryatt R. National Health Insurance: The first 18 months. S Afr Med J 2013;103(3):156-158.

3. Groom A. Congratulations! You've passed the coding course. Paper presented at the International Federation of Health Records Organizations Congress, Melbourne, Victoria, October 2000. Reprinted in ICD Coding Newsletter (November 2000). Melbourne: Victorian ICD Coding Committee and Victorian Department of Human Services, 2000:97-100.

4. Hohnloser JH, Purner F, Kadlec P. Coding medical concepts: A controlled experiment with a computerized coding tool. Med Inform (Lond) 1996;21(3):199-206.

5. Lorenzoni L, Da Cas R, Aparo UL. The quality of abstracting medical information from the medical record: The impact of training programmes. Int J Qual Health Care 1999;11(3):209-213.

6. Chute CG, Cohn SP, Campbell KE, Oliver DE, Campbell JR. The content coverage of clinical classifications. J Am Med Inform Assoc 1996;3(3):224-233.

7. De Vries E, Raubenheimer P, Kies B, Burch VC. Acute hospitalisation needs of adults admitted to public facilities in the Cape Town Metro district. S Afr Med J 2011;101(10):760-764.

8. Garcia-Olmos L, Salvador CH, Alberquilla A, et al. Comorbidity patterns in patients with chronic diseases in general practice. PLoS One 2012;7(2):e32141.

9. Chute CG, Huff SM, Ferguson JA, Walker JM, Halamka JD. There are important reasons for delaying implementation of the new ICD-10 coding system. Health Aff (Millwood) 2012;31(4):836-842.

10. Jiang G, Pathak J, Chute CG. Formalizing ICD coding rules using Formal Concept Analysis. J Biomed Inform 2009;42(3):504-517.

11. Hohnloser JH, Purner F. PADS (Patient Archiving and Documentation System): A computerized patient record with educational aspects. Int J Clin Monit Comput 1992;9(2):71-84.

12. Hohnloser JH, Puerner F, Soltanian H. Improving clinician's coded data entry through the use of an electronic patient record system: 3.5 years experience with a semiautomatic browsing and encoding tool in clinical routine. Comput Biomed Res 1996;29(1):41-47.

13. Watzlaf VJ, Garvin JH, Moeini S, Anania-Firouzan P. The effectiveness of ICD-10-CM in capturing public health diseases. Perspect Health Inf Manag 2007;4:6.

14. Chute CG. Clinical classification and terminology: Some history and current observations. J Am Med Inform Assoc 2000;7(3):298-303.

15. Silow-Carroll S, Edwards JN, Rodin D. Using electronic health records to improve quality and efficiency: The experiences of leading hospitals. Issue Brief (Commonw Fund) 2012;17:1-40.



Corresponding author: R E Dyers

Accepted 28 September 2015.
