Data De-identification and G-tagging

The data de-identification process occurs through the De-identify Dataset / Add GTAGs functionality included in CAT4.
When an extract is run through the De-identify Dataset / Add GTAGs functionality:

all patients who have withdrawn consent for their data to be used in clinical research are removed;
the extract is de-identified by removing all identifiable information such as name, address and date of birth. The only personal information that remains in a de-identified extract is the patient's gender, ethnicity and age in years.
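
The two steps above can be sketched in a few lines of Python. This is only an illustration of the logic, not the CAT4 implementation: the field names (name, address, date_of_birth, consent_withdrawn, age_years) and the record layout are assumptions made for the example, as the extract schema is not described in this guide.

```python
from datetime import date

# Hypothetical field names; the real CAT4 extract schema is not shown in this guide.
IDENTIFYING_FIELDS = {"name", "address", "date_of_birth"}


def age_in_years(dob, today):
    """Return the number of whole years between dob and today."""
    before_birthday = (today.month, today.day) < (dob.month, dob.day)
    return today.year - dob.year - (1 if before_birthday else 0)


def deidentify(records, today):
    """Drop patients who withdrew consent, then strip identifying fields."""
    result = []
    for rec in records:
        # Step 1: patients who have withdrawn consent are removed entirely.
        if rec.get("consent_withdrawn"):
            continue
        # Step 2: remove identifiers (and the consent flag itself),
        # keeping everything else, e.g. gender and ethnicity.
        clean = {
            key: value
            for key, value in rec.items()
            if key not in IDENTIFYING_FIELDS and key != "consent_withdrawn"
        }
        # Date of birth is replaced by age in whole years.
        clean["age_years"] = age_in_years(rec["date_of_birth"], today)
        result.append(clean)
    return result
```

Note that the date of birth is read only to calculate the age and is not copied into the output record.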

The G-tagging process is briefly described below:

Patient addresses are isolated from other identifiable information and sent securely to a service hosted by the Australian National University (ANU). No other patient information is sent.
ANU returns a pair of unique, non-identifiable codes. These codes are used for mapping in PAT CAT.
One code, the G-tag, is inserted into the de-identified extract file.
The other code, the A-tag, is isolated from the extract file and is hosted on a secure Pen CS server.
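
The flow of data in these steps can be sketched as follows. This is a rough illustration only: the function names (add_gtags, fake_tag_service), the record fields and the way the two codes are paired are all assumptions for the example. The real ANU service, how it derives its codes and how Pen CS stores the A-tag are not described in this guide, so a local stand-in that returns random codes is used here.

```python
import secrets


def fake_tag_service(address):
    """Local stand-in for the remote service (hypothetical).

    Returns a (g_tag, a_tag) pair of random codes. The real service is
    not documented here and does not necessarily work this way.
    """
    return secrets.token_hex(8), secrets.token_hex(8)


def add_gtags(patients, tag_service=fake_tag_service):
    """Return (extract_rows, a_tag_store) for a list of patient records."""
    extract_rows = []
    a_tag_store = {}  # kept apart from the extract file
    for patient in patients:
        # Only the address is passed to the service; nothing else is sent.
        g_tag, a_tag = tag_service(patient["address"])
        # The extract row carries the G-tag but never the address or the A-tag.
        row = {key: value for key, value in patient.items() if key != "address"}
        row["g_tag"] = g_tag
        extract_rows.append(row)
        # Assumption: the A-tag is stored alongside its G-tag for later mapping.
        a_tag_store[g_tag] = a_tag
    return extract_rows, a_tag_store
```

The point of the sketch is the separation: the extract rows and the A-tag store are returned as two distinct objects, so holding only the extract file is not enough to map a patient.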

Mapping of patient information requires both the G-tag and the A-tag to be present. Mapping only occurs to the level of an Australian Bureau of Statistics defined standardised statistical area. A G-tag can never be used to recover a patient's original address. All mapping is of non-identifiable information.
To ensure patient privacy, all mapping of patient data in PAT CAT uses aggregated data; individual patients are never shown. As detailed in Section 9.4, at least five patients must be present within a map segment for any information to be shown.
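
The five-patient minimum can be illustrated with a short Python sketch. It is not the PAT CAT implementation: the function name segment_counts and the use of plain area codes as input are assumptions made for the example. Only the threshold of five comes from this guide.

```python
from collections import Counter

MIN_PATIENTS_PER_SEGMENT = 5  # minimum stated in this guide


def segment_counts(area_codes, minimum=MIN_PATIENTS_PER_SEGMENT):
    """Count patients per map segment, hiding segments below the minimum.

    area_codes holds one statistical-area code per patient (hypothetical input).
    """
    counts = Counter(area_codes)
    # Segments with fewer than `minimum` patients are left out altogether.
    return {area: n for area, n in counts.items() if n >= minimum}
```

For example, given five patients in area "A", four in area "B" and seven in area "C", only "A" and "C" would be returned; area "B" would show no information.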
