python - How to parse Apple's IAP receipt mal-formatted JSON?
I got JSON like this from Apple:
{ "original-purchase-date-pst" = "2012-06-28 02:46:02 america/los_angeles"; "original-transaction-id" = "1000000051960431"; "bvrs" = "1.0"; "transaction-id" = "1000000051960431"; "quantity" = "1"; "original-purchase-date-ms" = "1340876762450"; "product-id" = "com.x"; "item-id" = "523404215"; "bid" = "com.x"; "purchase-date-ms" = "1340876762450"; "purchase-date" = "2012-06-28 09:46:02 etc/gmt"; "purchase-date-pst" = "2012-06-28 02:46:02 america/los_angeles"; "original-purchase-date" = "2012-06-28 09:46:02 etc/gmt"; }
This is not JSON as far as I know. JSON defines that each name is followed by : (colon) and that name/value pairs are separated by , (comma).
How can I parse this with Python's json (or simplejson) module?
The json module supports custom separators in json.dumps(), but not in json.loads(), and in simplejson/decoder.py, def JSONObject() has hard-coded delimiters of : and ,.
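To illustrate that asymmetry, here is a small sketch (Python 3, standard library only). The sample dictionary and the helper name loads_or_none are made up for illustration. It shows json.dumps() emitting = and ; as separators, while json.loads() rejects the same text because it has no matching option:

```python
import json

# Hypothetical sample data, not taken from a real receipt.
sample = {"quantity": "1", "bvrs": "1.0"}

# separators is a (item_separator, key_separator) tuple.
encoded = json.dumps(sample, separators=("; ", " = "))
print(encoded)

def loads_or_none(text):
    """Return the parsed value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return None

# The decoder only accepts ':' and ',' so the round trip fails.
print(loads_or_none(encoded))
```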
What can I do? Write my own parser?
That is indeed rather messed up. A quick fix is to replace the offending separators with a regular expression:
line = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
result = line.sub(r'\1: \2,', result)
You'll also need to remove the last comma:
trailingcomma = re.compile(r',(\s*})')
result = trailingcomma.sub(r'\1', result)
With these operations, the example loads as JSON:
>>> import json, re
>>> result = '''\
... {
...     "original-purchase-date-pst" = "2012-06-28 02:46:02 america/los_angeles";
...     "original-transaction-id" = "1000000051960431";
...     "bvrs" = "1.0";
...     "transaction-id" = "1000000051960431";
...     "quantity" = "1";
...     "original-purchase-date-ms" = "1340876762450";
...     "product-id" = "com.x";
...     "item-id" = "523404215";
...     "bid" = "com.x";
...     "purchase-date-ms" = "1340876762450";
...     "purchase-date" = "2012-06-28 09:46:02 etc/gmt";
...     "purchase-date-pst" = "2012-06-28 02:46:02 america/los_angeles";
...     "original-purchase-date" = "2012-06-28 09:46:02 etc/gmt";
... }
... '''
>>> line = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
>>> trailingcomma = re.compile(r',(\s*})')
>>> corrected = trailingcomma.sub(r'\1', line.sub(r'\1: \2,', result))
>>> json.loads(corrected)
{u'product-id': u'com.x', u'purchase-date-pst': u'2012-06-28 02:46:02 america/los_angeles', u'transaction-id': u'1000000051960431', u'original-purchase-date-pst': u'2012-06-28 02:46:02 america/los_angeles', u'bid': u'com.x', u'purchase-date-ms': u'1340876762450', u'original-transaction-id': u'1000000051960431', u'bvrs': u'1.0', u'original-purchase-date-ms': u'1340876762450', u'purchase-date': u'2012-06-28 09:46:02 etc/gmt', u'original-purchase-date': u'2012-06-28 09:46:02 etc/gmt', u'item-id': u'523404215', u'quantity': u'1'}
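The session above is from Python 2. As a rough sketch, the same two substitutions can be wrapped in a reusable function for Python 3. The names PAIR, TRAILING_COMMA and parse_receipt are illustrative, and the short receipt string is a trimmed-down version of the example:

```python
import json
import re

# Same two patterns as in the session, compiled once at module level.
PAIR = re.compile(r'("[^"]*")\s*=\s*("[^"]*");')
TRAILING_COMMA = re.compile(r',(\s*})')

def parse_receipt(text):
    """Rewrite `"key" = "value";` pairs as JSON and parse the result."""
    with_colons = PAIR.sub(r'\1: \2,', text)
    corrected = TRAILING_COMMA.sub(r'\1', with_colons)
    return json.loads(corrected)

receipt = '{ "quantity" = "1"; "product-id" = "com.x"; "bvrs" = "1.0"; }'
print(parse_receipt(receipt))
```

Like the original, this assumes every key and value is a double-quoted string without embedded quotes.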
It should handle nested mappings as well. It does assume there are no escaped " quotes in the values, though. If there are, you'll need a parser anyway.
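If a full parser is more than you want, one possible middle ground is a string pattern that tolerates backslash-escaped quotes. This is only a sketch: the names STRING and parse_receipt_escaped and the sample text are made up, and it assumes the escapes in the receipt follow JSON's backslash convention, which the example data does not confirm.

```python
import json
import re

# A double-quoted string that may contain backslash escapes such as \".
STRING = r'"(?:[^"\\]|\\.)*"'
PAIR = re.compile(r'(' + STRING + r')\s*=\s*(' + STRING + r');')
TRAILING_COMMA = re.compile(r',(\s*})')

def parse_receipt_escaped(text):
    """Like the quick fix, but keys and values may contain escaped quotes."""
    corrected = TRAILING_COMMA.sub(r'\1', PAIR.sub(r'\1: \2,', text))
    return json.loads(corrected)

# Hypothetical input with an escaped quote inside a value.
sample = r'{ "note" = "say \"hi\""; "quantity" = "1"; }'
print(parse_receipt_escaped(sample))
```

This is still a regex rewrite rather than a parser: a value that happens to contain text such as `, }` would be altered by the trailing-comma step.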