Wednesday, August 24, 2011

The need for decoupling wire format from semantic model

This is another point related to my suggestion for the HL7v3 "Fresh Look" task.

Firstly, let's start with brief definitions of the exchange model and the semantic model. The exchange model is the model that determines the wire format, typically expressed in XML Schema Definition (XSD), which developers can use to create a WSDL (Web Services Description Language) document for web service invocation. The semantic model is the model that defines the precise meaning of, and relationships between, data elements, as well as the binding of data elements to terminologies such as SNOMED CT.

Take HL7v3 as an example: the RMIM (Refined Message Information Model) is the semantic model, such as "POLB_RM004000UV01" for lab results. The wire format is serialized directly from the RMIM if you are using the messaging-oriented XML ITS. (The alternative in HL7v3 is the document-oriented XML ITS, which yields the CDA R2 wire format.) So, as we can see, there is currently no separate exchange model in v3. What problems does that cause?

Below is my original comment.

My comments and suggestions, in particular regarding the issue raised by Lloyd below about wire-format incompatibilities caused by additional requirements or refactoring.

>> The problem at the implementer/project level is that when you take the UV model and tightly constrain it to fit your exact requirements, you discover 6 months down the road that one or more of your constraints was wrong and you need to loosen it, or you have a new requirement that wasn't thought of, and this too requires refactoring and often results in wire-level incompatibilities.

In general I think we need to separate the exchange model (which determines the wire format) from the information model, and define a consistent way of mapping between the two. This will provide a few benefits:

1) Decoupling the exchange model from the information model allows the two to evolve independently to cater for different needs: the information model evolves to address ongoing business needs, while the exchange model evolves to address technical needs. In this way we can solve, to some extent, the dilemma Lloyd mentioned above, though I agree the information-model versioning challenge remains. But this is really a design issue faced by every solution designer in every industry. For example, when we design a data model, we need to ensure that the updated data model can handle new system requirements while still meeting existing ones, or at least that existing data in the old data model can be mapped to the updated one. Similar information-model versioning and governance should be in place.

2) The information model needs to be very rigorous and deeply hierarchical to ensure reliable understanding, persistence, and querying within the application; however, that same level of complexity does not need to cascade down to the exchange model, which should be simpler and flatter. (I think that is why the data types are simplified on the wire, even though they are largely based on ISO 21090.) That way, every developer can easily understand what information is being exchanged with a quick look at the wire format.
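To make the decoupling concrete, here is a minimal sketch in Python. All class and field names are illustrative inventions, not actual HL7 artifacts: a nested, terminology-bound "semantic" model is mapped by one explicit function to a flat "exchange" model, so the wire shape can stay stable while the internal model is refactored.

```python
from dataclasses import dataclass

# Hypothetical rich semantic model: nested and fully coded.
@dataclass
class Code:
    system: str   # e.g. a terminology OID such as SNOMED CT's
    value: str

@dataclass
class ObservationResult:
    code: Code
    value: str
    unit: Code

# Hypothetical flat exchange model: what actually goes on the wire.
@dataclass
class WireResult:
    code: str
    value: str
    unit: str

def to_wire(obs: ObservationResult) -> WireResult:
    """The single, consistent mapping from semantic to exchange model.
    If the semantic model is refactored, only this mapping changes;
    the wire format (and existing implementers) are untouched."""
    return WireResult(code=obs.code.value, value=obs.value, unit=obs.unit.value)

obs = ObservationResult(
    code=Code("2.16.840.1.113883.6.96", "113075003"),
    value="5.4",
    unit=Code("ucum", "mmol/L"),
)
wire = to_wire(obs)
```

The point is not the three-line mapping itself but where the knowledge lives: implementers see only `WireResult`, and the committee is free to restructure `ObservationResult`.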

And I am glad that Lloyd agreed.

From: [] On Behalf Of Lloyd McKenzie
Sent: Saturday, August 20, 2011 12:40 AM
To: Victor Chai
Cc: Grahame Grieve; Eliot Muir; Zel, M van der; HL7-MnM; RIMBAA; HL7 ITS
Subject: Re: A Fresh Look Proposal

Hi Victor,

I agree that the separation of semantic model and wire model is desirable, and I think essential. Furthermore, I think it's possible that there could be more than one wire model for a single semantic model. The pharmacy model *definitely* needs to support complex chemo-therapy protocols at a fully encoded level. That doesn't mean we can't produce a simplified wire format model for community pharmacy that satisfies the 80/20 rule. If a particular implementer needs a few more elements, odds are they're already defined in the "big" model so the necessary data dictionary codes would already be there. Furthermore, if both a simple and complex wire model are created, an implementer of the complex wire model could easily interpret the simple wire format due to the underlying mappings to a common semantic model.

The other benefit of the split is that it allows the committee to be a bit less paranoid and move forward sooner without being as "complete". If some refactoring is needed to the semantic model, they can do that while still retaining the original "simple" wire format model.

Grahame, can you clarify what you think the "high price" is that you think is associated with doing this?


Another related comment from Bernd from Germany.

Dear Lloyd,

I'm really happy that we at HL7 are slowly going to really look beyond the wire syntax bowl. This was the intention as we have started in the late nineties the discussion to introduce and mature the RM-ODP at HL7. So I strongly support your last statement regarding the need for separation of wire syntax and semantic models. However, this also holds for other parts of the points you addressed, where this separation is not strictly followed.

Very best regards


The fallacy of constraining a superset model

In HL7v3, though the overall development process is RIM --> DMIM + CMET --> RMIM, it is not strictly enforced in the tooling. E.g., after we developed the DMIM for the Lab domain, when we develop a lab-result-specific RMIM, the RMIM is not strictly constrained from the DMIM; rather, we use the DMIM as a baseline and then add or remove elements as appropriate, or sometimes we develop the RMIM directly from the RIM. So this is not really a problem in HL7v3: although the overall process is "design by constraint", the actual content of the information model is not constrained from a superset information model, since the RMIM is not strictly constrained from the DMIM.

In other models such as openEHR and ISO 13606, which use the archetype concept to develop reusable data structures covering all use cases, and then extend or constrain the archetypes for a specific use case, we will see the fallacy.

The idea of developing a superset model is quite attractive at first glance; in reality, however, it becomes a burden and results in many issues and challenges at implementation time. Let me quote my original comment below to explain the fallacy.

Firstly, I find it is not useful at the implementation level. For example, a UV or even national-level model may define 200 or more data elements, most of them optional, while a specific project needs only 20 of them. In this case the big model will likely confuse the implementer, since he/she needs to fully understand the exact business use cases and relevance of all the other optional data elements and their potential impact on the project; and if the user does not really understand all those optional data elements, he/she will not be able to constrain the big model in the first place.

Secondly, in order to address all possible requirements, the modeling process becomes extremely long. In the end the modeler may simply dump in every business data requirement he/she can think of, without proper modeling rigor, under time constraints or due to unclear use cases; and when the use cases do become clearer, the model needs refactoring. In software programming we call the result of this kind of development process "spaghetti code": unsustainable code that makes the system extremely fragile and subject to constant refactoring whenever there is even a slightly new or additional requirement. The other practical question is: why go to all this trouble to satisfy 20 percent or less of the needs at the sacrifice of the majority 80 percent or more? In the end, even those 20 percent of needs are not fully addressed.

Thirdly, from a technical implementation point of view, particularly for web service implementations, the strategy for ensuring payload backward compatibility is expansion of the XML structure and data types rather than contraction. E.g., in an existing XML payload we may define the data type of a data element as "integer", since all existing systems use integers; later, when a new system requires it to be "string", we can safely expand the data type to "string", since type expansion won't break backward compatibility (for incoming requests, though not for outgoing responses). Similarly, XML structure expansion is safer than contraction for ensuring backward compatibility. Given this technical reason and limitation, the model should not try to be a big fat one with loosely defined requirements; instead, it should start small, with the currently known requirements, and evolve between releases.
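The type-expansion point can be illustrated with a small sketch (the element names and payloads are made up for illustration). If a `count` element originally declared as an integer type is widened to a string type, a consumer that reads it as a string still accepts every payload the old senders produce, so nothing on the existing wire breaks:

```python
import xml.etree.ElementTree as ET

# Old senders produce integer values; after the schema widens the type
# to string, new senders may produce arbitrary text.
old_payload = "<request><count>42</count></request>"
new_payload = "<request><count>forty-two</count></request>"

def read_count(xml_text: str) -> str:
    # The widened consumer treats the value as a string. Every valid
    # integer lexical form is also a valid string, so expansion keeps
    # existing senders compatible; contraction (string -> integer)
    # would reject payloads like new_payload.
    return ET.fromstring(xml_text).findtext("count")

assert read_count(old_payload) == "42"        # old senders keep working
assert read_count(new_payload) == "forty-two" # new senders now allowed
```

The asymmetry is the whole argument: you can always loosen what you accept, but tightening it invalidates traffic that was previously legal.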

Monday, August 15, 2011

Using HL7v2 in SOA

I am involved in the HL7 Singapore technical committee. During one of our discussions about topics for technical sharing, we discussed among the committee members which topics are most relevant for Singapore. One member said they would like to see how they can use HL7v2 messages in their SOA services; the primary reason is that they feel comfortable with HL7v2, and a lot of their existing IT capability is probably built on HL7v2 messaging.

My initial reaction and response was that they could put the whole HL7v2 message within a CDA R2 element, and extract the patient information from the v2 message into the CDA header. Essentially, this transmits HL7v2 over HTTP instead of the MLLP protocol. (Interestingly, Australia seems to be thinking the other way round; see the "Implementing CDA in a V2.x World" forum.)
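A sketch of the wrapping approach, using Python's standard library: the raw v2 message (an illustrative fragment, not a conformant one) is base64-encoded into a CDA-style `nonXMLBody`. Namespaces, the required CDA header elements, and the exact `mediaType` value are all omitted or simplified here; a real document needs the full CDA R2 header.

```python
import base64
import xml.etree.ElementTree as ET

# An illustrative (non-conformant) HL7v2 fragment.
v2_message = ("MSH|^~\\&|LAB|SGH|EMR|SGH|20110815||ORU^R01|123|P|2.3\r"
              "PID|||S1234567A||Tan^Ah Kow")

# Wrap the raw v2 payload in a CDA-style nonXMLBody. CDA allows a
# base64 body via representation="B64"; the mediaType here is a placeholder.
doc = ET.Element("ClinicalDocument")
component = ET.SubElement(doc, "component")
body = ET.SubElement(component, "nonXMLBody")
text = ET.SubElement(body, "text",
                     mediaType="application/hl7-v2",  # placeholder value
                     representation="B64")
text.text = base64.b64encode(v2_message.encode()).decode()

xml_out = ET.tostring(doc, encoding="unicode")
```

A receiver that understands CDA can route and store the document from the header alone, while a legacy system simply decodes the body and feeds it to its existing v2 parser.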

But wait: when an application processes a v2 message, it still needs to transform the raw HL7v2 message into some object format or XML anyway. So why not just transform the v2 message into CDA R2 and expose a new service with CDA R2 as the payload, while the existing v2 infrastructure remains in use? I really do not see that it is that hard to get familiar with CDA R2.

What's your view?