Navigating Precision: Unraveling Conflicting Dates in Medical Records
- timespacemedicine
- Apr 19, 2024
Updated: Apr 18
By Jonathan D. Gold, MD MHA MSc FAMIA FHIMSS
April 22, 2024
Challenge yourself:
How does the Precision Matrix resolve conflicting dates within medical records?
What factors influence the reliability of temporal objects used to determine event dates?
Why are fully defined time/dates considered the most accurately recorded temporal points?
How can we address the reliability of input sources and data provenance?
Because temporal data reside in multiple sections of the electronic medical record, there are numerous opportunities to acquire temporal information, both during data capture and when the record is shared.
Electronic Medical Record Temporal Sources
Time/Date stamp for entries (caregiver notes, actions [medication administration, procedures, examinations, lab orders and results, imaging, etc.], routine observations [vital signs] and monitoring)
Free text in notes
Defined time/date fields from standardized or custom forms or reports
Imported data and metadata around importation
Workflow Capture and Considerations
Real-time temporal analysis during note entry process
Temporal analysis during importation from external sources
Data Normalization, Temporal Collection and Communication
Gathering and normalization of temporal data from imported metadata
Capture of temporal fields in imported patient data
Precision Matrix
When multiple sources provide conflicting dates for the same event, we use additional rules related to the precision of the derived dates. The significance of the level of precision we have in a temporal object becomes apparent when we use ‘derived dates’. Derived dates extrapolate occurrence dates from the temporal object and the metadata (for tethered dates) or the degree of precision (for unlinked dates). All dates associated with events, whether they are fully defined and unlinked dates or derived dates, are used to map where events should be plotted on the patient’s health timeline.
By factoring in the degree of precision we have in each of the derived dates for a single event, we can consistently reconcile an event’s date of occurrence even when multiple sources provide conflicting dates. Other strategies not listed below might also be considered and pursued instead.[1]
The extent to which we can trust the temporal aspect of a documented event depends upon the reliability of the temporal objects used to determine an event’s date and the reliability of the source. The most reliable temporal object describes a one-time event and has a certainty of ‘definite’, a value modifier of ‘equal’, and a fully defined, unlinked value date (hh:mm_mm/dd/yyyy).[2] The least dependable temporal object describes a potentially recurring event, with ambiguous certainty, null values for value and measure, and an unlinked date.[3] Tethered dates may be more or less specific than unlinked, historic dates.
The following rules have been employed to construct the Precision Matrix:
Fully defined time/dates (hh:mm_mm/dd/yyyy) or dates (mm/dd/yyyy) are the most accurately recorded temporal points.
Tethered dates used for events occurring days prior to the metadata date for the medical record are more accurate than those that occur weeks prior to the record. Weeks prior to the record are more accurate than months, which in turn are more accurate than single digit years, which in turn are more accurate than double-digit years.[4]
Tethered dates capturing events that occurred days to weeks before the medical record metadata date are more accurate than unlinked, partially defined dates (month/year). Tethered dates for events occurring months prior to the medical record metadata date are more accurate than unlinked, year-only dates.
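The rules above can be sketched as an ordinal ranking. This is a minimal illustration, not the matrix from Figure 1: the names `Precision` and `classify_tethered` and the numeric ranks are hypothetical. Two placements are assumptions borrowed from the one-time-event hierarchy described under Alternate Strategies: unlinked mm/yyyy above tethered months, and unlinked yyyy above tethered years. Ranking a full time/date slightly above a full date is also an assumption, since the first rule treats both as the most accurate.

```python
from enum import IntEnum


class Precision(IntEnum):
    """Illustrative ordinal ranks (higher = more precise). Values are assumptions."""
    TETHERED_YEARS_DOUBLE_DIGIT = 1  # e.g. "about 15 years ago"
    TETHERED_YEARS_SINGLE_DIGIT = 2  # e.g. "about 3 years ago"
    UNLINKED_YEAR = 3                # yyyy
    TETHERED_MONTHS = 4              # e.g. "4 months ago"
    UNLINKED_MONTH_YEAR = 5          # mm/yyyy
    TETHERED_WEEKS = 6               # e.g. "2 weeks ago"
    TETHERED_DAYS = 7                # e.g. "3 days ago"
    FULL_DATE = 8                    # mm/dd/yyyy
    FULL_DATETIME = 9                # hh:mm_mm/dd/yyyy


def classify_tethered(amount: int, unit: str) -> Precision:
    """Map a tethered expression such as '3 weeks ago' to a precision rank."""
    simple = {
        "days": Precision.TETHERED_DAYS,
        "weeks": Precision.TETHERED_WEEKS,
        "months": Precision.TETHERED_MONTHS,
    }
    if unit in simple:
        return simple[unit]
    if unit == "years":
        # Single-digit year offsets rank above double-digit ones.
        if amount < 10:
            return Precision.TETHERED_YEARS_SINGLE_DIGIT
        return Precision.TETHERED_YEARS_DOUBLE_DIGIT
    raise ValueError(f"unsupported unit: {unit!r}")
```

With this ordering, `classify_tethered(2, "weeks") > Precision.UNLINKED_MONTH_YEAR` and `classify_tethered(4, "months") > Precision.UNLINKED_YEAR` both hold, mirroring the third rule.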
Additional methodologies for determining an event’s start date should be considered:
Consider the account with the highest precision score as providing the correct date (the highest precision date). Within any given facility, the most precise date should be treated as the only event date for that database; the most precise representatives from all facilities (and databases) should then be compared.[5] Where multiple accounts of an event with different derived dates share the same high degree of precision, these should either be averaged to find a single date (mean) or, alternatively, the overlapping date(s) that occur most frequently across the reported ranges should be chosen (mode).
A Derived Aggregate Date method considers conflicting accounts of the start date and uses a hypothetical mean for the date or duration of occurrence to provide a near approximation for the actual event’s date. Accounts are weighted by precision and source veracity before interpolating the aggregate date. An approximation is plotted on the patient’s health timeline.
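The two methods above might be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: the `Account` type, the integer precision scale, the 0-to-1 veracity weight, and the choice to multiply precision by veracity are all hypothetical, since the article does not specify how the weights are combined. Ties in the first method are resolved with the mean; the most-frequent-date alternative is not shown.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Account:
    derived_date: date
    precision: int    # higher = more precise (hypothetical scale)
    veracity: float   # 0..1 source veracity weight (hypothetical)


def highest_precision_date(accounts: list[Account]) -> date:
    """Pick the most precise account; average the dates if several tie."""
    top = max(a.precision for a in accounts)
    best = [a.derived_date for a in accounts if a.precision == top]
    earliest = min(best)
    mean_offset = sum((d - earliest).days for d in best) / len(best)
    return earliest + timedelta(days=round(mean_offset))


def derived_aggregate_date(accounts: list[Account]) -> date:
    """Weighted mean of all accounts, weighting by precision * veracity."""
    weights = [a.precision * a.veracity for a in accounts]
    total = sum(weights)
    if total == 0:
        raise ValueError("all accounts have zero weight")
    earliest = min(a.derived_date for a in accounts)
    weighted = sum(
        w * (a.derived_date - earliest).days for w, a in zip(weights, accounts)
    )
    return earliest + timedelta(days=round(weighted / total))


accounts = [
    Account(date(2020, 3, 1), precision=8, veracity=1.0),
    Account(date(2020, 3, 5), precision=8, veracity=1.0),
    Account(date(2020, 1, 1), precision=3, veracity=0.5),
]
print(highest_precision_date(accounts))   # 2020-03-03
print(derived_aggregate_date(accounts))   # 2020-02-27
```

In this example the two high-precision accounts tie, so the first method returns their midpoint, while the aggregate method is pulled slightly earlier by the low-weight January account.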
Figure 1 is an example of a Precision Matrix.

Source Veracity Score
1. Data Provenance
2. Input Source by input type
3. Intervening Interval between entry and event
4. Event Criticality
Data Provenance
Data provenance confirms the authenticity of data to enable trust in its origin and use. Provenance provides a trail accounting for the origin of a piece of data and tracking how it got to its current place in the record. Currently, data provenance is rarely captured in the metadata, and this becomes particularly difficult for data gathered from sources external to the system. Normalization adds an extra impediment. Nonetheless, provenance should be sought as we include data from multiple sources.
Input Source
Input sources vary and can have different origins, such as dates entered by the patient when filling out a form or a personal health record, time periods captured by the clinician when interviewing the patient or reviewing external consultation notes, and system-generated time-dates for admission/discharge or lab reports. Depending on the type of record, dates may be attached to entities automatically (e.g., for lab results, admission time, or the time-date stamp of a note or order entry) or entered manually, as when a clinician captures the statement of a patient or caretaker (e.g., by assigning start dates for diagnoses on a problem list or past medical history, or by recording events in free text in the note section).
Methods that consider authorship for events vary but should include first-hand sources (like the patient’s or patient’s family’s descriptions). That said, the patient may not always provide the most reliable account of when an event occurred, how long it lasted or even what event occurred. For events that are introduced only by the patient in the personal health record (PHR) and that have no corroborating record in the electronic medical record (EMR), the veracity and precision might be doubtful. A patient may interpret symptoms and label them with an unfounded conclusion (like recording a diagnosis of “brain cancer” rather than “chronic headache”) when no definitive diagnosis has been objectively determined.
Particularly for non-recurring, chronic or one-time events that occurred prior to the implementation of an EMR, the patient may recall the start of an event more accurately than the EMR’s entry. This remains a weakness for the curation and incorporation of paper documentation into the medical record. New patients describing their past health experiences often are the sole or main source for medical history. For example, a patient may recall that she was diagnosed as diabetic in 1987, while the electronic medical record’s problem list defaults the diagnosis date to the date of entry into the record (e.g., “2005”, when the EMR system was introduced).
Intervening Interval
Generally, the shorter the interval between the occurrence of an event and its recording, the more likely it is that the temporal aspects are accurately captured.
Event Criticality
Life-threatening, critical and serious events typically are accompanied by greater and more granular documentation.
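One way the four factors could be combined into a single Source Veracity Score is sketched below. Everything numeric here is an assumption for illustration: the article does not define a scale, weights, or a decay function. The names `SourceFactors`, `interval_score`, and `veracity_score`, the 0-to-1 factor ratings, the equal default weights, and the one-year half-life for the intervening interval are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceFactors:
    provenance: float     # 0..1, how well the data's origin is documented
    input_source: float   # 0..1, rating for the input type (assumed scale)
    interval_days: int    # days between the event and its entry
    criticality: float    # 0..1, higher for life-threatening events


def interval_score(interval_days: int, half_life_days: float = 365.0) -> float:
    """Shorter intervals score closer to 1; the half-life is an assumption."""
    return 0.5 ** (interval_days / half_life_days)


def veracity_score(
    f: SourceFactors,
    weights: tuple[float, float, float, float] = (0.25, 0.25, 0.25, 0.25),
) -> float:
    """Weighted mean of the four factor scores, in the range 0..1."""
    parts = (
        f.provenance,
        f.input_source,
        interval_score(f.interval_days),
        f.criticality,
    )
    return sum(w * p for w, p in zip(weights, parts)) / sum(weights)


recent = SourceFactors(provenance=0.9, input_source=0.8, interval_days=1, criticality=0.7)
remote = SourceFactors(provenance=0.9, input_source=0.8, interval_days=3650, criticality=0.7)
print(veracity_score(recent) > veracity_score(remote))  # True
```

Holding the other factors fixed, the entry recorded one day after the event scores higher than the one recorded ten years later, which matches the direction described under Intervening Interval.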
Alternate Strategies
If events can be classified into one-time, chronic, acute, and ambiguous categories, we should consider using category-specific precision hierarchies or strategies to arrive at the date of occurrence.
For one-time events, unlinked dates specified to the level of hh:mm_mm/dd/yyyy or mm/dd/yyyy would be given the highest precision, followed by dates tethered to the current record entry with near (hours) and close (days, weeks) approximations. Unlinked partially defined dates (mm/yyyy) would be given precedence over a tethered approximate date (months). An unlinked and defined year (yyyy) should rank higher than a tethered distant (years) approximation or an unlinked “occurred” record. The highest precision date might be the best option.
The Precision Matrix probably is most suited for chronic disease. Alternatively, we should consider if the first derived date might be a more consistent option. Both strategies should be on the table. Particularly with chronic disease and one-time events, the vast majority of medical documentation for adult patients appears in the paper file. Different organizations have inconsistent policies regarding which data is curated to the electronic record. Variation in data accuracy may hamper interpretation for these two critical groups.
For acute disease, the precision matrix used for one-time events (as presented above) would be sufficient. We must also consider how to separate individual episodes of the same diagnosis (e.g., acute otitis media, acute gastroenteritis, etc.). These are not recurrences, but rather, unique instances.
Many EMRs include a “copy-forward” function that allows users to take a previous note (potentially from days or weeks earlier) and insert a copy of it into a new note, which increases the likelihood that a system’s NLP will misinterpret the note. Consider a note on January 10th that states the patient “complains of fever and chills for the past two days”. This note may be “copied forward” into a note three days later (January 13th) during a follow-up encounter after the patient has begun antibiotic treatment. From the first note, we would infer that the patient’s symptoms began on January 8th, but the copy-forward text in the second note implies that the patient’s symptoms began on January 11th (after the patient had already started treatment).
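The arithmetic behind this copy-forward example can be checked with a short sketch. The helper `derive_tethered_date` is hypothetical, and the year 2024 is an assumption, since the example gives only the month and day.

```python
from datetime import date, timedelta

DAYS_PER_UNIT = {"days": 1, "weeks": 7}


def derive_tethered_date(note_date: date, amount: int, unit: str) -> date:
    """Derive an onset date by subtracting the stated offset from the note's date."""
    return note_date - timedelta(days=amount * DAYS_PER_UNIT[unit])


# "fever and chills for the past two days"
original_onset = derive_tethered_date(date(2024, 1, 10), 2, "days")
# The same sentence copied forward into a note dated three days later.
copied_onset = derive_tethered_date(date(2024, 1, 13), 2, "days")

print(original_onset)  # 2024-01-08
print(copied_onset)    # 2024-01-11
```

Because the derived date depends on the metadata date of the note that carries the text, identical wording yields two different onset dates, which is the conflict a reconciliation step has to detect.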
For ambiguous disease (“possibly had chicken pox as a child”), temporality should be considered as not plottable. There might be value in listing events deemed “not plottable” but of possible clinical importance (“polio in childhood”).
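The category-specific strategies in this section could be dispatched roughly as follows. This is a sketch under assumptions: the function `plot_date`, the category labels, and the `(derived date, precision rank)` tuple are hypothetical, and "first derived date" is read here as the earliest derived date, which the article does not state explicitly. Separating individual acute episodes is not handled.

```python
from datetime import date
from typing import Optional

# (derived date, precision rank); a higher rank means more precise (assumed scale)
Account = tuple[date, int]


def plot_date(
    category: str,
    accounts: list[Account],
    chronic_strategy: str = "highest_precision",
) -> Optional[date]:
    """Return the date to plot on the timeline, or None if not plottable."""
    if category == "ambiguous" or not accounts:
        return None  # listed as "not plottable" rather than placed on the timeline
    if category == "chronic" and chronic_strategy == "first_derived":
        return min(d for d, _ in accounts)  # earliest derived date (assumption)
    if category in ("one-time", "acute", "chronic"):
        # On ties, max() keeps the first account in list order.
        return max(accounts, key=lambda a: a[1])[0]
    raise ValueError(f"unknown category: {category!r}")


history = [(date(2005, 1, 1), 8), (date(1987, 1, 1), 3)]
print(plot_date("chronic", history))                    # 2005-01-01
print(plot_date("chronic", history, "first_derived"))   # 1987-01-01
print(plot_date("ambiguous", history))                  # None
```

The sample data echo the diabetes example: the highest-precision strategy returns the 2005 entry date, while the first-derived-date strategy returns the patient's recalled 1987 diagnosis.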
Conclusion
In the intricate landscape of medical documentation, the Precision Matrix can help guide us through the maze of conflicting dates. By weighing precision, considering data provenance, and assessing input sources, the matrix and source veracity score empower us to unravel temporal discrepancies with greater confidence, and enable us to reliably infer a timeframe for events.
[1] For example, capture overlapping date ranges for an event provided by multiple sources and determine the most common shared date(s).
[2] e.g., time of death.
[3] e.g., Previous Suspected Allergic Reaction to Bee Venom.
[4] A common rounding artifact has been cited for numbers that are multiples of 6 months or multiples of 5 for other temporal measurement units. See Hripcsak (ibid).
[5] This method fails to promote unlinked derived dates over tethered but current dates. The disadvantage is that chronic events that are tethered, and simultaneously current, may display a higher precision than historic data that may be more accurate about the first diagnosis of the disease.
