In a CMAJ article titled “Big data’s dirty secret” (Webster, June 6), the author highlights the barriers to leveraging big data in Canada and points to “technological limitations stemming from mismanagement by government e-health agencies and commercial turf battles” as an important cause of the problem. Webster also cites others who have had difficulty with the data generated by EMRs currently used in Canada, including Patricia Sullivan-Taylor, manager of primary health care information at the Canadian Institute for Health Information, and Alex Mair, director of health system use for Canada Health Infoway, the organization responsible for developing and implementing Canada’s health infostructure. According to Sullivan-Taylor, each EMR system produces different types of data, a problem that was not resolved before EMRs were installed in physician offices. Mair also acknowledges that progress in using “big data” has been slow and that the resulting inability to gain insights into clinical data is a key gap for clinicians.
While I agree with many of the concepts and comments in the article, I would like to add some clarifications. Mario Bojilov recently published a concise overview of big data and defines it as “data sets that — due to their size (volume), the speed they are created with (velocity), and the type of information they contain (variety) — are pushing the existing infrastructure to its limits”. In healthcare, big data seems to have become synonymous with data analytics, “a process of inspecting, cleaning, transforming, and modeling data with the goal of highlighting useful information, suggesting conclusions, and supporting decision making” (Wikipedia). However, I believe it is misleading to think broadly of EMR systems in physician offices as generators of big data. While large networks of clinical users on a common EMR can certainly generate large amounts of data, many of the current limitations apply to “small data” just as much as they do to the high velocity, volume, and variety of big data.
Simply put, there are many EMR systems that cannot effectively be used to generate insights from small collections of data, never mind big cumulative data sets. For example, EMR systems that record information in narrative format (not as discrete data) are unable to generate analytic reports for patient populations because they were never designed to function as analytical tools. EMRs that collect data in highly structured formats can output information that can be queried with analytical tools, whether those tools are built into the EMR or are third-party software that takes data sets exported from the EMR and analyzes them outside the clinical system. However, taking data from multiple EMRs and merging it together is not possible. Why, you might ask? Because each EMR system has been built using proprietary data structures, without any coordinated national effort to define data standards for EMRs. Interac bank machines work because all the messages and data structures are the same, irrespective of which bank you use. The same cannot be said for EMR data. One of the great technology successes of the UK’s defunct National Programme for IT was a project called GP2GP, through which a patient’s electronic health record can be transferred directly and securely between GP practices. This was made possible by standardized data structures.
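To make the point about proprietary data structures concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the two "vendor" export formats, the field names, the shared BloodPressureReading record, and the threshold of 140 are invented for illustration and do not describe any real EMR product, national standard, or clinical guideline. It shows that a simple population-level question can only be asked once each vendor's records have been translated, one hand-written translator per vendor, into a common shape.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class BloodPressureReading:
    """Hypothetical shared record shape; not an actual national standard."""
    patient_id: str
    taken_on: date
    systolic_mmhg: int
    diastolic_mmhg: int


def from_vendor_a(row: dict) -> BloodPressureReading:
    # Hypothetical vendor A: one combined "142/91" string and an ISO date.
    systolic, diastolic = row["bp"].split("/")
    return BloodPressureReading(
        patient_id=row["pid"],
        taken_on=date.fromisoformat(row["date"]),
        systolic_mmhg=int(systolic),
        diastolic_mmhg=int(diastolic),
    )


def from_vendor_b(row: dict) -> BloodPressureReading:
    # Hypothetical vendor B: discrete numeric fields and a day/month/year date.
    day, month, year = (int(part) for part in row["ObsDate"].split("/"))
    return BloodPressureReading(
        patient_id=row["PatientNumber"],
        taken_on=date(year, month, day),
        systolic_mmhg=row["Sys"],
        diastolic_mmhg=row["Dia"],
    )


def count_elevated(readings, systolic_threshold=140):
    # Illustrative threshold only; not clinical guidance.
    return sum(1 for r in readings if r.systolic_mmhg >= systolic_threshold)


# Invented sample exports, one row per vendor.
vendor_a_export = [{"pid": "A-001", "date": "2013-05-02", "bp": "142/91"}]
vendor_b_export = [
    {"PatientNumber": "B-417", "ObsDate": "03/05/2013", "Sys": 118, "Dia": 76}
]

# Only after translation can records from both systems be queried together.
merged = [from_vendor_a(r) for r in vendor_a_export] + [
    from_vendor_b(r) for r in vendor_b_export
]
elevated = count_elevated(merged)
```

With an agreed standard, every system would emit the shared shape directly and the per-vendor translators would be unnecessary. Without one, each new system adds another translator that someone has to write, validate, and maintain, and narrative-only records offer no discrete fields to translate at all.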
As provincial EMR programs begin to wind down, starting with Alberta in 2014, and provinces transition to a support and optimization role for users of EMRs, the lack of data standards for EMRs is going to become a greater and greater problem. Short of completely redesigning existing EMRs so that all systems use the same data structures and standards, it will be a nearly impossible task to retrofit the EMRs used by 70% of target physicians. We are stuck with what we currently have. The policy failures of the past, namely insufficient support for clinical leadership in the early stages of EMRs and a lack of focus on developing and implementing standardized data structures and clinical messages for EMR systems (nationally and provincially), have created the pickle we are now in.
Before we can fully utilize and gain insights from big data, we had better sort out small data in EMRs. This needs to begin with a focus on data quality, the management of patient populations at the practice level, and ensuring that the clinical messages passed between EMRs use nationally accepted standards so that data is interoperable and usable by other systems.
What do you think? Are there other priorities to consider in order to make data more usable by multiple EMRs?