python - How to parse Apple's IAP receipt mal-formatted JSON?


I got JSON from Apple that looks like this:

    {
        "original-purchase-date-pst" = "2012-06-28 02:46:02 america/los_angeles";
        "original-transaction-id" = "1000000051960431";
        "bvrs" = "1.0";
        "transaction-id" = "1000000051960431";
        "quantity" = "1";
        "original-purchase-date-ms" = "1340876762450";
        "product-id" = "com.x";
        "item-id" = "523404215";
        "bid" = "com.x";
        "purchase-date-ms" = "1340876762450";
        "purchase-date" = "2012-06-28 09:46:02 etc/gmt";
        "purchase-date-pst" = "2012-06-28 02:46:02 america/los_angeles";
        "original-purchase-date" = "2012-06-28 09:46:02 etc/gmt";
    }

This is not JSON as we know it. In JSON it's defined that

each name is followed by : (colon), and the name/value pairs are separated by , (comma).

How can I parse this with Python's json (or simplejson) module?

json supports custom separators in json.dumps(), but not in json.loads(), and in simplejson/decoder.py, def JSONObject() has the delimiters : and , hard-coded.
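To make that asymmetry concrete, here is a small sketch using only the standard library json module. The one-entry `receipt` dict is made up for illustration; the point is that `dumps()` accepts a `separators` tuple while `loads()` rejects the same keyword.

```python
import json

# Hypothetical one-entry receipt, just for illustration.
receipt = {"quantity": "1"}

# dumps() lets the caller choose (item_separator, key_separator).
encoded = json.dumps(receipt, separators=("; ", " = "))
print(encoded)  # {"quantity" = "1"}

# loads() has no matching option: the unknown keyword is handed to
# JSONDecoder, which rejects it with a TypeError.
try:
    json.loads(encoded, separators=("; ", " = "))
    rejected = False
except TypeError:
    rejected = True
print(rejected)  # True
```

So the text has to be massaged into real JSON before `json.loads()` ever sees it.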

What can I do? Write my own parser?

That is indeed rather messed up. A quick fix is to replace the offending separators with a regular expression:

    import re

    # Turn each  "key" = "value";  pair into  "key": "value",
    line = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
    result = line.sub(r'\1: \2,', result)

You'll also need to remove the last comma:

    # Drop the comma left dangling before a closing brace
    trailingcomma = re.compile(r',(\s*})')
    result = trailingcomma.sub(r'\1', result)

With these operations the example now loads as JSON:

    >>> import json, re
    >>> result = '''\
    ... {
    ...     "original-purchase-date-pst" = "2012-06-28 02:46:02 america/los_angeles";
    ...     "original-transaction-id" = "1000000051960431";
    ...     "bvrs" = "1.0";
    ...     "transaction-id" = "1000000051960431";
    ...     "quantity" = "1";
    ...     "original-purchase-date-ms" = "1340876762450";
    ...     "product-id" = "com.x";
    ...     "item-id" = "523404215";
    ...     "bid" = "com.x";
    ...     "purchase-date-ms" = "1340876762450";
    ...     "purchase-date" = "2012-06-28 09:46:02 etc/gmt";
    ...     "purchase-date-pst" = "2012-06-28 02:46:02 america/los_angeles";
    ...     "original-purchase-date" = "2012-06-28 09:46:02 etc/gmt";
    ... }
    ... '''
    >>> line = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
    >>> trailingcomma = re.compile(r',(\s*})')
    >>> corrected = trailingcomma.sub(r'\1', line.sub(r'\1: \2,', result))
    >>> json.loads(corrected)
    {u'product-id': u'com.x', u'purchase-date-pst': u'2012-06-28 02:46:02 america/los_angeles', u'transaction-id': u'1000000051960431', u'original-purchase-date-pst': u'2012-06-28 02:46:02 america/los_angeles', u'bid': u'com.x', u'purchase-date-ms': u'1340876762450', u'original-transaction-id': u'1000000051960431', u'bvrs': u'1.0', u'original-purchase-date-ms': u'1340876762450', u'purchase-date': u'2012-06-28 09:46:02 etc/gmt', u'original-purchase-date': u'2012-06-28 09:46:02 etc/gmt', u'item-id': u'523404215', u'quantity': u'1'}
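The same two substitutions can be wrapped into one reusable function. This is only a sketch: the names `PAIR`, `TRAILING_COMMA` and `parse_receipt` are made up here, and the two-entry sample is a shortened version of the receipt above.

```python
import json
import re

# Rewrites  "key" = "value";  as  "key": "value",
PAIR = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
# Matches the comma left dangling before a closing brace.
TRAILING_COMMA = re.compile(r',(\s*})')


def parse_receipt(text):
    """Convert the '=' / ';' receipt text to JSON and decode it."""
    as_json = PAIR.sub(r'\1: \2,', text)
    as_json = TRAILING_COMMA.sub(r'\1', as_json)
    return json.loads(as_json)


sample = '''{
    "quantity" = "1";
    "product-id" = "com.x";
}'''
receipt = parse_receipt(sample)
print(receipt["product-id"])  # com.x
```

On Python 3 the decoded keys and values are plain `str` objects, so the `u'...'` prefixes from the session above no longer appear.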

It should handle nested mappings as well. This does assume there are no escaped " quotes in the values though. If there are, you'll need a parser anyway.
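If escaped quotes are the only complication, one possible middle ground short of a full parser is to widen the string pattern so that it steps over backslash escapes. The sketch below is an assumption-laden variant, not part of the original answer: `STRING`, `PAIR`, `TRAILING_COMMA` and `parse_receipt_escaped` are hypothetical names, and the sample input is invented. The trailing-comma pattern is still purely textual, so a value that itself contains a comma followed by } would be damaged.

```python
import json
import re

# A double-quoted string that may contain backslash escapes such as \".
STRING = r'"(?:[^"\\]|\\.)*"'
# Rewrites  "key" = "value";  as  "key": "value",
PAIR = re.compile(r'(%s)\s*=\s*(%s);' % (STRING, STRING))
# Still textual: assumes no value contains a comma followed by }.
TRAILING_COMMA = re.compile(r',(\s*})')


def parse_receipt_escaped(text):
    """Like the two-regex fix, but tolerant of \\" inside keys and values."""
    as_json = PAIR.sub(r'\1: \2,', text)
    as_json = TRAILING_COMMA.sub(r'\1', as_json)
    return json.loads(as_json)


# Invented sample: the value holds escaped quotes and a semicolon.
sample = r'{ "note" = "say \"hi\"; ok"; "quantity" = "1"; }'
parsed = parse_receipt_escaped(sample)
print(parsed["note"])  # say "hi"; ok
```

Anything beyond that, such as unquoted values or other data types, is where a real parser becomes the safer choice.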

