Tony says he wants:

Here is a list of the things that we would like to have for protein
interaction data:

0. Organism
1. Authors
2. Type of Interaction Assayed
2a. Reference (PMID at least)
3. A list of all interactors; use the IntAct ID and map from this to
  something (like Locus or EntrezID or...)
4. A list that maps bait proteins to prey proteins
  (this can be either with each interaction, or with one element
    for each bait and the prey as values).
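
The two layouts mentioned for the bait-to-prey list (one record per
interaction, or one element per bait with its prey as values) can be
sketched as follows. This is only an illustration in Python; the IDs are
made-up placeholders in the EBI-XXXXXX style, and the function name
bait_to_prey is hypothetical.

```python
from collections import defaultdict

# Layout 1: one record per interaction.
# Placeholder IDs only; these are not real IntAct entries.
interactions = [
    {"bait": "EBI-0000001", "prey": "EBI-0000002"},
    {"bait": "EBI-0000001", "prey": "EBI-0000003"},
    {"bait": "EBI-0000004", "prey": "EBI-0000002"},
]


def bait_to_prey(records):
    """Layout 2: collapse the records into one entry per bait."""
    mapping = defaultdict(list)
    for rec in records:
        preys = mapping[rec["bait"]]
        if rec["prey"] not in preys:  # keep each prey once per bait
            preys.append(rec["prey"])
    return dict(mapping)


bait_map = bait_to_prey(interactions)
```

Either layout carries the same information; the per-bait form is simply
more convenient when looking up all prey found for a given bait.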

I think that is currently all that we need. Each bit of information is encoded
in an IntAct ID (EBI-XXXXXX).  We should have this information available as
well as a map from these IDs to the relevant information.
I think that you had a preference for the gene locus. At least for the yeast
data, it seems that a majority of the interactors can be mapped to a gene
locus (but this mapping is not one to one). The problem arises when we need
to decide which ID to choose. Also, if there is no gene locus,
I had chosen to use the protein name, but is that what we ought to do?
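
One possible way to make the choice explicit, rather than silently picking
an ID, is to record how each label was chosen. The sketch below is only an
assumption about a workable policy, not a decision: it prefers the locus,
flags the case where several loci map to one interactor, and falls back to
the protein name. The function name choose_label, the "source" tags, and
all IDs and names are hypothetical.

```python
def choose_label(intact_id, loci, protein_name):
    """Pick a label for one interactor and record how it was chosen."""
    candidates = sorted(set(loci))  # deterministic order, duplicates removed
    if len(candidates) == 1:
        label, source = candidates[0], "locus"
    elif candidates:
        # Mapping is not one to one: take the first locus (an arbitrary
        # rule, assumed here) but keep all candidates for later review.
        label, source = candidates[0], "locus-ambiguous"
    else:
        # No gene locus available: fall back to the protein name.
        label, source = protein_name, "protein-name"
    return {
        "id": intact_id,
        "label": label,
        "source": source,
        "candidates": candidates,
    }


unique = choose_label("EBI-0000001", ["LOCUS_A"], "prot_a")
ambiguous = choose_label("EBI-0000002", ["LOCUS_C", "LOCUS_B"], "prot_b")
fallback = choose_label("EBI-0000003", [], "prot_c")
```

Keeping the "source" field means the ambiguous and name-only cases can be
counted and revisited once the policy question is settled.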

There is code in the RBioinf book (RDataTech chapter) and an example file
in the accompanying RBioinf package that shows how to do most of the stuff
above using the XML package and XPath; that is RG's preferred way of dealing
with this stuff.
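
For orientation, the general XPath idea looks roughly like this. Note that
this is not the RBioinf code and does not use R's XML package; it is a
stand-in written with Python's standard library, whose ElementTree module
supports a limited XPath subset. The XML fragment is a made-up toy; its
element and attribute names are assumptions, and real IntAct files are
richer and use an XML namespace.

```python
import xml.etree.ElementTree as ET

# Toy fragment with placeholder IDs; the structure is assumed, not taken
# from an actual IntAct download.
TOY_XML = """
<entry>
  <interactorList>
    <interactor id="1">
      <names><shortLabel>prot_a</shortLabel></names>
      <xref><ref db="intact" id="EBI-0000001"/></xref>
    </interactor>
    <interactor id="2">
      <names><shortLabel>prot_b</shortLabel></names>
      <xref><ref db="intact" id="EBI-0000002"/></xref>
    </interactor>
  </interactorList>
</entry>
"""


def interactor_table(root):
    """Map each IntAct-style ID to the interactor's short label."""
    table = {}
    for node in root.findall(".//interactor"):
        ref = node.find("xref/ref[@db='intact']")  # XPath attribute predicate
        label = node.findtext("names/shortLabel")
        if ref is not None:
            table[ref.get("id")] = label
    return table


table = interactor_table(ET.fromstring(TOY_XML.strip()))
```

The same pattern (select nodes with a path expression, then pull out an
attribute or text value per node) is what builds the map from IDs to the
relevant information.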
