Working with text data

To run your code, click Run. The checker will tell you whether your code is correct and, if it is not, offer hints on what to fix. If you are stuck, click Solution to see the answer.

Exercise 1: Subsetting data

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBleHBlY3RlZCB0byBiZSBhcHBy
b3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCIiLCJzYW1wbGUiOiIjIFRoZSB2YXJpYWJsZSBiZWxvdyBjb250YWlucyBhIHBhcmFncmFwaCBmcm9tIEpQTSdzIDIwMTQgYW5udWFsIHJlcG9ydFxuY2F0KGRvYylcblxuIyBMb2FkIGluIHN0cmluZ3IgZm9yIGVhc3kgc3RyaW5nIGZ1bmN0aW9uc1xubGlicmFyeShzdHJpbmdyKVxuXG4jIEV4dHJhY3QgdGhlIGZpcnN0IDEwMCBjaGFyYWN0ZXJzIGFuZCBzdG9yZSB0aGVtIGluIGEgdmFyaWFibGUgbmFtZSBmaXJzdFxuZmlyc3QgPC0gXG5cbiMgRXh0cmFjdCB0aGUgbWlkZGxlIDEwMCBjaGFyYWN0ZXJzICg3OTMgdG8gODkyLCBpbmNsdXNpdmUpIGFuZCBzdG9yZSBpbiBhXG4jIFZhcmlhYmxlIG5hbWVkIG1pZGRsZVxubWlkZGxlIDwtIFxuXG4jIEV4dHJhY3QgdGhlIGxhc3QgMTAwIGNoYXJhY3RlcnMgYW5kIHN0b3JlIHRoZW0gaW4gYSB2YXJpYWJsZSBuYW1lZCBsYXN0XG5sYXN0IDwtIFxuXG4jIHByaW50IG91dCB0aGUgc3Vic2V0c1xucHJpbnQocGFzdGUoZmlyc3QsIG1pZGRsZSwgbGFzdCwgc2VwPScgLS0tICcpKVxuI0VORCIsInNvbHV0aW9uIjoiIyBUaGUgdmFyaWFibGUgYmVsb3cgY29udGFpbnMgYSBwYXJhZ3JhcGggZnJvbSBKUE0ncyAyMDE0IGFubnVhbCByZXBvcnRcbmNhdChkb2MpXG5cbiMgTG9hZCBpbiBzdHJpbmdyIGZvciBlYXN5IHN0cmluZyBmdW5jdGlvbnNcbmxpYnJhcnkoc3RyaW5ncilcblxuIyBFeHRyYWN0IHRoZSBmaXJzdCAxMDAgY2hhcmFjdGVycyBhbmQgc3RvcmUgdGhlbSBpbiBhIHZhcmlhYmxlIG5hbWUgZmlyc3RcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApXG5cbiMgRXh0cmFjdCB0aGUgbWlkZGxlIDEwMCBjaGFyYWN0ZXJzICg3OTMgdG8gODkyLCBpbmNsdXNpdmUpIGFuZCBzdG9yZSBpbiBhXG4jIFZhcmlhYmxlIG5hbWVkIG1pZGRsZVxubWlkZGxlIDwtIHN0cl9zdWIoZG9jLCA3OTMsIDg5MilcblxuIyBFeHRyYWN0IHRoZSBsYXN0IDEwMCBjaGFyYWN0ZXJzIGFuZCBzdG9yZSB0aGVtIGluIGEgdmFyaWFibGUgbmFtZWQgbGFzdFxubGFzdCA8LSBzdHJfc3ViKGRvYywgc3RyX2xlbmd0aChkb2MpLTk5LCBzdHJfbGVuZ3RoKGRvYykpXG5cbiMgcHJpbnQgb3V0IHRoZSBzdWJzZXRzXG5wcmludChwYXN0ZShmaXJzdCwgbWlkZGxlLCBsYXN0LCBzZXA9JyAtLS0gJykpXG4jRU5EIiwic2N0IjoiIyBUZW1wbGF0ZSBiYXNlZCBvbiBodHRwczovL3d3dy5yZG9jdW1lbnRhdGlvbi5vcmcvcGFja2FnZXMvdGVzdHdoYXQvdmVyc2lvbnMvNC4xLjFcbiMgQ2hl
Y2sgaWYgc29tZXRoaW5nIGlzIGV4cGxpY2l0bHkgdHlwZWRcblxudGVzdF9leHByZXNzaW9uX291dHB1dChcInN0cl9sZW5ndGgoZmlyc3QpXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYGZpcnN0YCBpc24ndCB0aGUgcmlnaHQgbGVuZ3RoLlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcImZpcnN0XCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYGZpcnN0YCBpc24ndCBxdWl0ZSByaWdodC5cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJzdHJfbGVuZ3RoKG1pZGRsZSlcIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgbWlkZGxlYCBpc24ndCB0aGUgcmlnaHQgbGVuZ3RoLlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcIm1pZGRsZVwiLCBpbmNvcnJlY3RfbXNnPVwiSXQgYXBwZWFycyB5b3VyIGBtaWRkbGVgIGlzbid0IHF1aXRlIHJpZ2h0LlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcInN0cl9sZW5ndGgobGFzdClcIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgbGFzdGAgaXNuJ3QgdGhlIHJpZ2h0IGxlbmd0aC5cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJsYXN0XCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYGxhc3RgIGlzbid0IHF1aXRlIHJpZ2h0LlwiKVxuXG4jIHRlc3Rfc3R1ZGVudF90eXBlZCgneCA8LSAyJywgbm90X3R5cGVkX21zZz0nJylcblxuIyBDaGVjayBpZiBmdW5jdGlvbiB3YXMgdXNlZCBpbiBpbnB1dCBjb2RlXG4jIHRlc3RfZnVuY3Rpb24oJ2MnLGluY29ycmVjdF9tc2c9JycpICBcblxuIyBSZXF1aXJlcyBhbiBvYmplY3QgYHhgIHRvIGhhdmUgdGhlIHNhbWUgdmFsdWUgYXMgdGhlIHNvbHV0aW9uXG4jIHRlc3Rfb2JqZWN0KFwieFwiLGluY29ycmVjdF9tc2cgPSBcIlwiLHVuZGVmaW5lZF9tc2cgPSBcIlwiKSAgXG5cbiMgUmVxdWlyZXMgYW4gb25qZWN0IHdpdGggdGhlIHNhbWUgdmFsdWUgb2YgYHhgIGluIHRoZSBzb2x1dGlvblxuIyB0ZXN0X2FuX29iamVjdChcInhcIix1bmRlZmluZWRfbXNnPVwiXCIpXG5cbiMgQ2hlY2tzIGlmIG91dHB1dCBvZiBzdHVkZW50J3MgY29kZSBjb250YWlucyBnaXZlbiBldmFsdWF0ZWQgZXhwcmVzc2lvblxuIyB0ZXN0X291dHB1dF9jb250YWlucyhcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIilcblxuIyBDaGVjayBpZiBhIHZlY3RvciBvZiBwcmVkZWZpbmVkIG9iamVjdHMgYXJlIHVuY2hhbmdlZFxuIyB0ZXN0X3ByZWRlZmluZWRfb2JqZWN0cyhjKCd4JywneScpLGluY29ycmVjdF9tc2c9XCJEb24ndCBvbnZlcndyaXRlIHRoZSBwcmVkZWZpbmVkIHZhcmlhYmxlc1wiKVxuXG4jIENoZWNrcyBmb3IgYSByZWdleCBwYXR0ZXJuIGluIHRyaGUgb3V0cHV0XG4jIHRlc3Rfb3V0cHV0X3JlZ2V4KHBhdHRlcm4sZml4ZWQ9RiwgdGltZXM9MSwgaW5jb3JyZWN0X21zZz0nJylcblxuIyBDYW4gY2hlY2sgYW4gYXJiaXRyYXJ5IGV4cHJlc3Npb24gYWNyb3NzIGJv
dGggc29sdXRpb24gYW5kIHN0dWRlbnQgY29kZVxuI3Rlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJ0eXBlb2YoY29tcGFueV9uYW1lKVwiLCBpbmNvcnJlY3RfbXNnPVwiRGlkIHlvdSBzdG9yZSB0ZXh0dWFsIGRhdGEgaW4gYGNvbXBhbnlfbmFtZWA/XCIpXG5cbnRlc3RfZXJyb3IoKVxuc3VjY2Vzc19tc2coXCJBd2Vzb21lIVwiKVxuXG4jIE90aGVyIGZ1bmN0aW9ucyB0byBub3RlOlxuIyAgICAgLSB0ZXN0X29yKGEsYikgLS0gY2hlY2tzIGlmIGVpdGhlciB0ZXN0IGEgb3IgdGVzdCBiIHBhc3NcbiMgICAgIC0gdGVzdF9nZ3Bsb3QoKSAtLSBjYW4gY2hlY2sgaWYgcGxvdHMgYXJlIGNvcnJlY3RcbiMgICAgIC0gdGVzdF9mdW5jdGlvbigpIC0tIGNhbiBhbHNvIGNoZWNrIGluY2x1ZGVkIHBhcmFtZXRlcnNcbiMgICAgIC0gdGVzdF9sb29wKCkgLS0gY2hlY2tpbmcgZm9yIGFuZCB3aGlsZSBsb29wc1xuIyAgICAgLSB0ZXN0X2xpYnJhcnlfZnVuY3Rpb24oJ3BhY2thZ2UnLCBub3RfY2FsbGVkX21zZz0nJyxpbmNvcnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2lmX2Vsc2UoKSAtLSBjaGVja2luZyBpZiBzdGF0ZW1lbnRzXG4jICAgICAtIHRlc3RfZXhwcmVzc2lvbl9lcnJvcigpIC0tIGNhbiBjaGVjayBpZiBmdW5jdGlvbnMgYXJlIHByb3Blcmx5IGRlZmluZWRcbiMgICAgIC0gdGVzdF9vcGVyYXRvcignb3BlcmF0b3InLCksIG5vdF9jYWxsZWRfbXNnPScnLGluY29ycmVjdF9tc2c9JycpXG4jICAgICAtIHRlc3RfZnVuY3Rpb25fZGVmaW5pdGlvbigpIC0tIHJpZ29yb3VzbHkgY2hlY2sgZGVmaW5lZCBmdW5jdGlvblxuIyAgICAgLSB0ZXN0X2RhdGFfZnJhbWUoKSAtLSBjaGVjayBpZiBkYXRhZnJhbWUgW2NvbHVtbnNdIGFyZSBlcXVpdmFsZW50XG4jICAgICAtIHRlc3RfZnVuY3Rpb25fcmVzdWx0LCB0ZXN0X2V4cHJlc3Npb25fcmVzdWx0In0=

Exercise 2: Changing case

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5saWJyYXJ5KHN0cmluZ3IpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBl
eHBlY3RlZCB0byBiZSBhcHByb3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCJcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApIiwic2FtcGxlIjoiIyBUaGUgdmFyaWFibGUgZG9jIGNvbnRhaW5zIGEgcGFyYWdyYXBoIGZyb20gSlBNJ3MgMjAxNCBhbm51YWwgcmVwb3J0XG4jIFRoZSB2YXJpYWJsZSBmaXJzdCBmcm9tIHRoZSBwcmV2aW91cyBleGVyY2lzZSBpcyBsb2FkZWRcbmNhdChmaXJzdClcblxuIyBDcmVhdGUgYSB2ZXJzaW9uIG9mIGZpcnN0IHRoYXQgaXMgYWxsIGxvd2VyY2FzZVxubG93ZXIgPC0gXG5cbiMgQ3JlYXRlIGEgdmVyc2lvbiBvZiBmaXJzdCB0aGF0IGlzIGFsbCBVUFBFUkNBU0VcbnVwcGVyIDwtIFxuXG4jIENyZWF0ZSBhIHZlcnNpb24gb2YgZmlyc3QgdGhhdCBpcyBhbGwgVGl0bGVjYXNlXG50aXRsZSA8LSBcblxuIyBwcmludCBvdXQgdGhlIGRpZmZlcmVudCB2ZXJzaW9uc1xucHJpbnQocGFzdGUobG93ZXIsIHVwcGVyLCB0aXRsZSwgc2VwPScgLS0tICcpKVxuICBcbiNFTkQiLCJzb2x1dGlvbiI6IiMgVGhlIHZhcmlhYmxlIGRvYyBjb250YWlucyBhIHBhcmFncmFwaCBmcm9tIEpQTSdzIDIwMTQgYW5udWFsIHJlcG9ydFxuIyBUaGUgdmFyaWFibGUgZmlyc3QgZnJvbSB0aGUgcHJldmlvdXMgZXhlcmNpc2UgaXMgbG9hZGVkXG5jYXQoZmlyc3QpXG5cbiMgQ3JlYXRlIGEgdmVyc2lvbiBvZiBmaXJzdCB0aGF0IGlzIGFsbCBsb3dlcmNhc2Vcbmxvd2VyIDwtIHN0cl90b19sb3dlcihmaXJzdClcblxuIyBDcmVhdGUgYSB2ZXJzaW9uIG9mIGZpcnN0IHRoYXQgaXMgYWxsIFVQUEVSQ0FTRVxudXBwZXIgPC0gc3RyX3RvX3VwcGVyKGZpcnN0KVxuXG4jIENyZWF0ZSBhIHZlcnNpb24gb2YgZmlyc3QgdGhhdCBpcyBhbGwgVGl0bGVjYXNlXG50aXRsZSA8LSBzdHJfdG9fdGl0bGUoZmlyc3QpXG5cbiMgcHJpbnQgb3V0IHRoZSBkaWZmZXJlbnQgdmVyc2lvbnNcbnByaW50KHBhc3RlKGxvd2VyLCB1cHBlciwgdGl0bGUsIHNlcD0nIC0tLSAnKSlcbiAgXG4jRU5EIiwic2N0IjoiIyBUZW1wbGF0ZSBiYXNlZCBvbiBodHRwczovL3d3dy5yZG9jdW1lbnRhdGlvbi5vcmcvcGFja2FnZXMvdGVzdHdoYXQvdmVyc2lvbnMvNC4xLjFcbiMgQ2hlY2sgaWYgc29tZXRoaW5nIGlzIGV4cGxpY2l0bHkgdHlwZWRcblxudGVzdF9leHByZXNzaW9uX291dHB1dChcImxvd2VyXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYGxvd2VyYCBpc24ndCBxdWl0ZSByaWdodC5cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJ1cHBl
clwiLCBpbmNvcnJlY3RfbXNnPVwiSXQgYXBwZWFycyB5b3VyIGB1cHBlcmAgaXNuJ3QgcXVpdGUgcmlnaHQuXCIpXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwidGl0bGVcIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgdGl0bGVgIGlzbid0IHF1aXRlIHJpZ2h0LlwiKVxuXG4jIHRlc3Rfc3R1ZGVudF90eXBlZCgneCA8LSAyJywgbm90X3R5cGVkX21zZz0nJylcblxuIyBDaGVjayBpZiBmdW5jdGlvbiB3YXMgdXNlZCBpbiBpbnB1dCBjb2RlXG4jIHRlc3RfZnVuY3Rpb24oJ2MnLGluY29ycmVjdF9tc2c9JycpICBcblxuIyBSZXF1aXJlcyBhbiBvYmplY3QgYHhgIHRvIGhhdmUgdGhlIHNhbWUgdmFsdWUgYXMgdGhlIHNvbHV0aW9uXG4jIHRlc3Rfb2JqZWN0KFwieFwiLGluY29ycmVjdF9tc2cgPSBcIlwiLHVuZGVmaW5lZF9tc2cgPSBcIlwiKSAgXG5cbiMgUmVxdWlyZXMgYW4gb25qZWN0IHdpdGggdGhlIHNhbWUgdmFsdWUgb2YgYHhgIGluIHRoZSBzb2x1dGlvblxuIyB0ZXN0X2FuX29iamVjdChcInhcIix1bmRlZmluZWRfbXNnPVwiXCIpXG5cbiMgQ2hlY2tzIGlmIG91dHB1dCBvZiBzdHVkZW50J3MgY29kZSBjb250YWlucyBnaXZlbiBldmFsdWF0ZWQgZXhwcmVzc2lvblxuIyB0ZXN0X291dHB1dF9jb250YWlucyhcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIilcblxuIyBDaGVjayBpZiBhIHZlY3RvciBvZiBwcmVkZWZpbmVkIG9iamVjdHMgYXJlIHVuY2hhbmdlZFxuIyB0ZXN0X3ByZWRlZmluZWRfb2JqZWN0cyhjKCd4JywneScpLGluY29ycmVjdF9tc2c9XCJEb24ndCBvbnZlcndyaXRlIHRoZSBwcmVkZWZpbmVkIHZhcmlhYmxlc1wiKVxuXG4jIENoZWNrcyBmb3IgYSByZWdleCBwYXR0ZXJuIGluIHRyaGUgb3V0cHV0XG4jIHRlc3Rfb3V0cHV0X3JlZ2V4KHBhdHRlcm4sZml4ZWQ9RiwgdGltZXM9MSwgaW5jb3JyZWN0X21zZz0nJylcblxuIyBDYW4gY2hlY2sgYW4gYXJiaXRyYXJ5IGV4cHJlc3Npb24gYWNyb3NzIGJvdGggc29sdXRpb24gYW5kIHN0dWRlbnQgY29kZVxuI3Rlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJ0eXBlb2YoY29tcGFueV9uYW1lKVwiLCBpbmNvcnJlY3RfbXNnPVwiRGlkIHlvdSBzdG9yZSB0ZXh0dWFsIGRhdGEgaW4gYGNvbXBhbnlfbmFtZWA/XCIpXG5cbnRlc3RfZXJyb3IoKVxuc3VjY2Vzc19tc2coXCJBd2Vzb21lIVwiKVxuXG4jIE90aGVyIGZ1bmN0aW9ucyB0byBub3RlOlxuIyAgICAgLSB0ZXN0X29yKGEsYikgLS0gY2hlY2tzIGlmIGVpdGhlciB0ZXN0IGEgb3IgdGVzdCBiIHBhc3NcbiMgICAgIC0gdGVzdF9nZ3Bsb3QoKSAtLSBjYW4gY2hlY2sgaWYgcGxvdHMgYXJlIGNvcnJlY3RcbiMgICAgIC0gdGVzdF9mdW5jdGlvbigpIC0tIGNhbiBhbHNvIGNoZWNrIGluY2x1ZGVkIHBhcmFtZXRlcnNcbiMgICAgIC0gdGVzdF9sb29wKCkgLS0gY2hlY2tpbmcgZm9yIGFuZCB3aGlsZSBsb29wc1xuIyAgICAgLSB0ZXN0X2xpYnJhcnlfZnVuY3Rpb24oJ3BhY2thZ2UnLCBub3RfY2FsbGVk
X21zZz0nJyxpbmNvcnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2lmX2Vsc2UoKSAtLSBjaGVja2luZyBpZiBzdGF0ZW1lbnRzXG4jICAgICAtIHRlc3RfZXhwcmVzc2lvbl9lcnJvcigpIC0tIGNhbiBjaGVjayBpZiBmdW5jdGlvbnMgYXJlIHByb3Blcmx5IGRlZmluZWRcbiMgICAgIC0gdGVzdF9vcGVyYXRvcignb3BlcmF0b3InLCksIG5vdF9jYWxsZWRfbXNnPScnLGluY29ycmVjdF9tc2c9JycpXG4jICAgICAtIHRlc3RfZnVuY3Rpb25fZGVmaW5pdGlvbigpIC0tIHJpZ29yb3VzbHkgY2hlY2sgZGVmaW5lZCBmdW5jdGlvblxuIyAgICAgLSB0ZXN0X2RhdGFfZnJhbWUoKSAtLSBjaGVjayBpZiBkYXRhZnJhbWUgW2NvbHVtbnNdIGFyZSBlcXVpdmFsZW50XG4jICAgICAtIHRlc3RfZnVuY3Rpb25fcmVzdWx0LCB0ZXN0X2V4cHJlc3Npb25fcmVzdWx0In0=

Exercise 3: Searching for phrases

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5saWJyYXJ5KHN0cmluZ3IpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBl
eHBlY3RlZCB0byBiZSBhcHByb3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCJcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApIiwic2FtcGxlIjoiIyBUaGUgdmFyaWFibGUgZG9jIGNvbnRhaW5zIGEgcGFyYWdyYXBoIGZyb20gSlBNJ3MgMjAxNCBhbm51YWwgcmVwb3J0XG5jYXQoZG9jKVxuXG4jIEhvdyBtYW55IHRpbWVzIGRvZXMgdGhlIHBhcmFncmFwaCBtZW50aW9uIHRoZSBcIlNFQ1wiP1xuU0VDX21lbnRpb25zIDwtIFxuXG4jIFdoZXJlIGlzIHRoZSBTRUMgbWVudGlvbmVkPyAgKEFsbCBsb2NhdGlvbnMpXG5TRUNfbG9jYXRpb24gPC0gXG5cbiMgcHJpbnQgb3V0IHRoZSByZXN1bHRzXG5TRUNfbWVudGlvbnNcblNFQ19sb2NhdGlvblxuXG4jRU5EIiwic29sdXRpb24iOiIjIFRoZSB2YXJpYWJsZSBkb2MgY29udGFpbnMgYSBwYXJhZ3JhcGggZnJvbSBKUE0ncyAyMDE0IGFubnVhbCByZXBvcnRcbmNhdChkb2MpXG5cbiMgSG93IG1hbnkgdGltZXMgZG9lcyB0aGUgcGFyYWdyYXBoIG1lbnRpb24gdGhlIFwiU0VDXCI/XG5TRUNfbWVudGlvbnMgPC0gc3RyX2NvdW50KGRvYywgXCJTRUNcIilcblxuIyBXaGVyZSBpcyB0aGUgU0VDIG1lbnRpb25lZD8gIChBbGwgbG9jYXRpb25zKVxuU0VDX2xvY2F0aW9uIDwtIHN0cl9sb2NhdGVfYWxsKGRvYywgXCJTRUNcIilcblxuIyBwcmludCBvdXQgdGhlIHJlc3VsdHNcblNFQ19tZW50aW9uc1xuU0VDX2xvY2F0aW9uXG5cbiNFTkQiLCJzY3QiOiIjIFRlbXBsYXRlIGJhc2VkIG9uIGh0dHBzOi8vd3d3LnJkb2N1bWVudGF0aW9uLm9yZy9wYWNrYWdlcy90ZXN0d2hhdC92ZXJzaW9ucy80LjEuMVxuIyBDaGVjayBpZiBzb21ldGhpbmcgaXMgZXhwbGljaXRseSB0eXBlZFxuXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwiU0VDX21lbnRpb25zXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYFNFQ19tZW50aW9uc2AgaXNuJ3QgcXVpdGUgcmlnaHQuXCIpXG50ZXN0X3N0dWRlbnRfdHlwZWQoXCJzdHJfbG9jYXRlX2FsbFwiLCBub3RfdHlwZWRfbXNnPVwiRGlkIHlvdSByZW1lbWJlciB0byB1c2UgYHN0cl9sb2NhdGVfYWxsKClgIHNvIGFzIHRvIGdldCBBTEwgbWVudGlvbnMnIGxvY2F0aW9ucz9cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJTRUNfbG9jYXRpb25cIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgU0VDX2xvY2F0aW9uYCBpc24ndCBxdWl0ZSByaWdodC5cIilcblxuIyB0ZXN0X3N0dWRlbnRfdHlwZWQoJ3ggPC0gMicsIG5vdF90eXBlZF9tc2c9Jycp
XG5cbiMgQ2hlY2sgaWYgZnVuY3Rpb24gd2FzIHVzZWQgaW4gaW5wdXQgY29kZVxuIyB0ZXN0X2Z1bmN0aW9uKCdjJyxpbmNvcnJlY3RfbXNnPScnKSAgXG5cbiMgUmVxdWlyZXMgYW4gb2JqZWN0IGB4YCB0byBoYXZlIHRoZSBzYW1lIHZhbHVlIGFzIHRoZSBzb2x1dGlvblxuIyB0ZXN0X29iamVjdChcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIix1bmRlZmluZWRfbXNnID0gXCJcIikgIFxuXG4jIFJlcXVpcmVzIGFuIG9uamVjdCB3aXRoIHRoZSBzYW1lIHZhbHVlIG9mIGB4YCBpbiB0aGUgc29sdXRpb25cbiMgdGVzdF9hbl9vYmplY3QoXCJ4XCIsdW5kZWZpbmVkX21zZz1cIlwiKVxuXG4jIENoZWNrcyBpZiBvdXRwdXQgb2Ygc3R1ZGVudCdzIGNvZGUgY29udGFpbnMgZ2l2ZW4gZXZhbHVhdGVkIGV4cHJlc3Npb25cbiMgdGVzdF9vdXRwdXRfY29udGFpbnMoXCJ4XCIsaW5jb3JyZWN0X21zZyA9IFwiXCIpXG5cbiMgQ2hlY2sgaWYgYSB2ZWN0b3Igb2YgcHJlZGVmaW5lZCBvYmplY3RzIGFyZSB1bmNoYW5nZWRcbiMgdGVzdF9wcmVkZWZpbmVkX29iamVjdHMoYygneCcsJ3knKSxpbmNvcnJlY3RfbXNnPVwiRG9uJ3Qgb252ZXJ3cml0ZSB0aGUgcHJlZGVmaW5lZCB2YXJpYWJsZXNcIilcblxuIyBDaGVja3MgZm9yIGEgcmVnZXggcGF0dGVybiBpbiB0cmhlIG91dHB1dFxuIyB0ZXN0X291dHB1dF9yZWdleChwYXR0ZXJuLGZpeGVkPUYsIHRpbWVzPTEsIGluY29ycmVjdF9tc2c9JycpXG5cbiMgQ2FuIGNoZWNrIGFuIGFyYml0cmFyeSBleHByZXNzaW9uIGFjcm9zcyBib3RoIHNvbHV0aW9uIGFuZCBzdHVkZW50IGNvZGVcbiN0ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwidHlwZW9mKGNvbXBhbnlfbmFtZSlcIiwgaW5jb3JyZWN0X21zZz1cIkRpZCB5b3Ugc3RvcmUgdGV4dHVhbCBkYXRhIGluIGBjb21wYW55X25hbWVgP1wiKVxuXG50ZXN0X2Vycm9yKClcbnN1Y2Nlc3NfbXNnKFwiQXdlc29tZSFcIilcblxuIyBPdGhlciBmdW5jdGlvbnMgdG8gbm90ZTpcbiMgICAgIC0gdGVzdF9vcihhLGIpIC0tIGNoZWNrcyBpZiBlaXRoZXIgdGVzdCBhIG9yIHRlc3QgYiBwYXNzXG4jICAgICAtIHRlc3RfZ2dwbG90KCkgLS0gY2FuIGNoZWNrIGlmIHBsb3RzIGFyZSBjb3JyZWN0XG4jICAgICAtIHRlc3RfZnVuY3Rpb24oKSAtLSBjYW4gYWxzbyBjaGVjayBpbmNsdWRlZCBwYXJhbWV0ZXJzXG4jICAgICAtIHRlc3RfbG9vcCgpIC0tIGNoZWNraW5nIGZvciBhbmQgd2hpbGUgbG9vcHNcbiMgICAgIC0gdGVzdF9saWJyYXJ5X2Z1bmN0aW9uKCdwYWNrYWdlJywgbm90X2NhbGxlZF9tc2c9JycsaW5jb3JyZWN0X21zZz0nJylcbiMgICAgIC0gdGVzdF9pZl9lbHNlKCkgLS0gY2hlY2tpbmcgaWYgc3RhdGVtZW50c1xuIyAgICAgLSB0ZXN0X2V4cHJlc3Npb25fZXJyb3IoKSAtLSBjYW4gY2hlY2sgaWYgZnVuY3Rpb25zIGFyZSBwcm9wZXJseSBkZWZpbmVkXG4jICAgICAtIHRlc3Rfb3BlcmF0b3IoJ29wZXJhdG9yJywpLCBub3RfY2FsbGVkX21zZz0nJyxpbmNv
cnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2Z1bmN0aW9uX2RlZmluaXRpb24oKSAtLSByaWdvcm91c2x5IGNoZWNrIGRlZmluZWQgZnVuY3Rpb25cbiMgICAgIC0gdGVzdF9kYXRhX2ZyYW1lKCkgLS0gY2hlY2sgaWYgZGF0YWZyYW1lIFtjb2x1bW5zXSBhcmUgZXF1aXZhbGVudFxuIyAgICAgLSB0ZXN0X2Z1bmN0aW9uX3Jlc3VsdCwgdGVzdF9leHByZXNzaW9uX3Jlc3VsdCJ9

Regular expressions

Exercise 4: Finding mentions by pattern

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5saWJyYXJ5KHN0cmluZ3IpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBl
eHBlY3RlZCB0byBiZSBhcHByb3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCJcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApIiwic2FtcGxlIjoiIyBUaGUgdmFyaWFibGUgZG9jIGNvbnRhaW5zIGEgcGFyYWdyYXBoIGZyb20gSlBNJ3MgMjAxNCBhbm51YWwgcmVwb3J0XG5jYXQoZG9jKVxuXG4jIHdoYXQgYXJlIHRoZSBwYWdlIHJlZmVyZW5jZXMgaW4gdGhlIHRleHQ/ICBUaGVzZSBhcmUgb2YgdGhlIGZvcm0gXCJwYWdlICNcIiBvclxuIyBcInBhZ2VzICNcIi5cbnBhZ2VfcmVmZXJlbmNlcyA8LSBzdHJfZXh0cmFjdF9hbGwoKVxuXG4jIEhvdyB3YXMgdGhlIFNFQyBtZW50aW9uZWQ/IEV4dHJhY3QgdGhlIHRleHQgZnJvbSB0d28gd29yZHMgYmVmb3JlIFwiU0VDXCIgaXNcbiMgbWVudGlvbmVkIHVudGlsIHR3byB3b3JkcyBhZnRlciB0aGUgU0VDIGlzIG1lbnRpb25lZFxuU0VDX3JlZmVyZW5jZXMgPC0gc3RyX2V4dHJhY3RfYWxsKClcblxuIyBwcmludCBvdXQgdGhlIHJlc3VsdHNcbnBhZ2VfcmVmZXJlbmNlc1xuU0VDX3JlZmVyZW5jZXNcblxuI0VORCIsInNvbHV0aW9uIjoiIyBUaGUgdmFyaWFibGUgZG9jIGNvbnRhaW5zIGEgcGFyYWdyYXBoIGZyb20gSlBNJ3MgMjAxNCBhbm51YWwgcmVwb3J0XG5jYXQoZG9jKVxuXG4jIHdoYXQgYXJlIHRoZSBwYWdlIHJlZmVyZW5jZXMgaW4gdGhlIHRleHQ/ICBUaGVzZSBhcmUgb2YgdGhlIGZvcm0gXCJwYWdlICNcIiBvclxuIyBcInBhZ2VzICNcIi5cbnBhZ2VfcmVmZXJlbmNlcyA8LSBzdHJfZXh0cmFjdF9hbGwoZG9jLCBcInBhZ2VzP1s6Ymxhbms6XVs6ZGlnaXQ6XStcIilcblxuIyBIb3cgd2FzIHRoZSBTRUMgbWVudGlvbmVkPyBFeHRyYWN0IHRoZSB0ZXh0IGZyb20gdHdvIHdvcmRzIGJlZm9yZSBcIlNFQ1wiIGlzXG4jIG1lbnRpb25lZCB1bnRpbCB0d28gd29yZHMgYWZ0ZXIgdGhlIFNFQyBpcyBtZW50aW9uZWRcblNFQ19yZWZlcmVuY2VzIDwtIHN0cl9leHRyYWN0X2FsbChkb2MsIFwiWzpncmFwaDpdK1s6Ymxhbms6XVs6Z3JhcGg6XStbOmJsYW5rOl1TRUNbOmJsYW5rOl1bOmdyYXBoOl0rWzpibGFuazpdWzpncmFwaDpdK1wiKVxuXG4jIHByaW50IG91dCB0aGUgcmVzdWx0c1xucGFnZV9yZWZlcmVuY2VzXG5TRUNfcmVmZXJlbmNlc1xuXG4jRU5EIiwic2N0IjoiIyBUZW1wbGF0ZSBiYXNlZCBvbiBodHRwczovL3d3dy5yZG9jdW1lbnRhdGlvbi5vcmcvcGFja2FnZXMvdGVzdHdoYXQvdmVyc2lvbnMvNC4xLjFcbiMgQ2hlY2sgaWYgc29tZXRoaW5nIGlzIGV4cGxpY2l0bHkgdHlwZWRc
blxudGVzdF9leHByZXNzaW9uX291dHB1dChcImxlbmd0aChwYWdlX3JlZmVyZW5jZXNbWzFdXSlcIiwgaW5jb3JyZWN0X21zZz1cImBwYWdlX3JlZmVyZW5jZXNgIHNob3VsZCBlbmQgdXAgd2l0aCAyIG1hdGNoZXMuXCIpXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwicGFnZV9yZWZlcmVuY2VzXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYFNFQ19tZW50aW9uc2AgaXNuJ3QgcXVpdGUgcmlnaHQsIGJ1dCBpdCBoYXMgdGhlIHJpZ2h0IG51bWJlciBvZiBtYXRjaGVzLlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcImxlbmd0aChTRUNfcmVmZXJlbmNlc1tbMV1dKVwiLCBpbmNvcnJlY3RfbXNnPVwiYFNFQ19yZWZlcmVuY2VzYCBzaG91bGQgZW5kIHVwIHdpdGggMiBtYXRjaGVzLlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcIlNFQ19yZWZlcmVuY2VzXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYFNFQ19sb2NhdGlvbmAgaXNuJ3QgcXVpdGUgcmlnaHQsIGJ1dCBpdCBoYXMgdGhlIHJpZ2h0IG51bWJlciBvZiBtYXRjaGVzLlwiKVxuXG4jIHRlc3Rfc3R1ZGVudF90eXBlZCgneCA8LSAyJywgbm90X3R5cGVkX21zZz0nJylcblxuIyBDaGVjayBpZiBmdW5jdGlvbiB3YXMgdXNlZCBpbiBpbnB1dCBjb2RlXG4jIHRlc3RfZnVuY3Rpb24oJ2MnLGluY29ycmVjdF9tc2c9JycpICBcblxuIyBSZXF1aXJlcyBhbiBvYmplY3QgYHhgIHRvIGhhdmUgdGhlIHNhbWUgdmFsdWUgYXMgdGhlIHNvbHV0aW9uXG4jIHRlc3Rfb2JqZWN0KFwieFwiLGluY29ycmVjdF9tc2cgPSBcIlwiLHVuZGVmaW5lZF9tc2cgPSBcIlwiKSAgXG5cbiMgUmVxdWlyZXMgYW4gb25qZWN0IHdpdGggdGhlIHNhbWUgdmFsdWUgb2YgYHhgIGluIHRoZSBzb2x1dGlvblxuIyB0ZXN0X2FuX29iamVjdChcInhcIix1bmRlZmluZWRfbXNnPVwiXCIpXG5cbiMgQ2hlY2tzIGlmIG91dHB1dCBvZiBzdHVkZW50J3MgY29kZSBjb250YWlucyBnaXZlbiBldmFsdWF0ZWQgZXhwcmVzc2lvblxuIyB0ZXN0X291dHB1dF9jb250YWlucyhcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIilcblxuIyBDaGVjayBpZiBhIHZlY3RvciBvZiBwcmVkZWZpbmVkIG9iamVjdHMgYXJlIHVuY2hhbmdlZFxuIyB0ZXN0X3ByZWRlZmluZWRfb2JqZWN0cyhjKCd4JywneScpLGluY29ycmVjdF9tc2c9XCJEb24ndCBvbnZlcndyaXRlIHRoZSBwcmVkZWZpbmVkIHZhcmlhYmxlc1wiKVxuXG4jIENoZWNrcyBmb3IgYSByZWdleCBwYXR0ZXJuIGluIHRyaGUgb3V0cHV0XG4jIHRlc3Rfb3V0cHV0X3JlZ2V4KHBhdHRlcm4sZml4ZWQ9RiwgdGltZXM9MSwgaW5jb3JyZWN0X21zZz0nJylcblxuIyBDYW4gY2hlY2sgYW4gYXJiaXRyYXJ5IGV4cHJlc3Npb24gYWNyb3NzIGJvdGggc29sdXRpb24gYW5kIHN0dWRlbnQgY29kZVxuI3Rlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJ0eXBlb2YoY29tcGFueV9uYW1lKVwiLCBpbmNvcnJlY3RfbXNnPVwiRGlkIHlvdSBzdG9yZSB0ZXh0
dWFsIGRhdGEgaW4gYGNvbXBhbnlfbmFtZWA/XCIpXG5cbnRlc3RfZXJyb3IoKVxuc3VjY2Vzc19tc2coXCJBd2Vzb21lIVwiKVxuXG4jIE90aGVyIGZ1bmN0aW9ucyB0byBub3RlOlxuIyAgICAgLSB0ZXN0X29yKGEsYikgLS0gY2hlY2tzIGlmIGVpdGhlciB0ZXN0IGEgb3IgdGVzdCBiIHBhc3NcbiMgICAgIC0gdGVzdF9nZ3Bsb3QoKSAtLSBjYW4gY2hlY2sgaWYgcGxvdHMgYXJlIGNvcnJlY3RcbiMgICAgIC0gdGVzdF9mdW5jdGlvbigpIC0tIGNhbiBhbHNvIGNoZWNrIGluY2x1ZGVkIHBhcmFtZXRlcnNcbiMgICAgIC0gdGVzdF9sb29wKCkgLS0gY2hlY2tpbmcgZm9yIGFuZCB3aGlsZSBsb29wc1xuIyAgICAgLSB0ZXN0X2xpYnJhcnlfZnVuY3Rpb24oJ3BhY2thZ2UnLCBub3RfY2FsbGVkX21zZz0nJyxpbmNvcnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2lmX2Vsc2UoKSAtLSBjaGVja2luZyBpZiBzdGF0ZW1lbnRzXG4jICAgICAtIHRlc3RfZXhwcmVzc2lvbl9lcnJvcigpIC0tIGNhbiBjaGVjayBpZiBmdW5jdGlvbnMgYXJlIHByb3Blcmx5IGRlZmluZWRcbiMgICAgIC0gdGVzdF9vcGVyYXRvcignb3BlcmF0b3InLCksIG5vdF9jYWxsZWRfbXNnPScnLGluY29ycmVjdF9tc2c9JycpXG4jICAgICAtIHRlc3RfZnVuY3Rpb25fZGVmaW5pdGlvbigpIC0tIHJpZ29yb3VzbHkgY2hlY2sgZGVmaW5lZCBmdW5jdGlvblxuIyAgICAgLSB0ZXN0X2RhdGFfZnJhbWUoKSAtLSBjaGVjayBpZiBkYXRhZnJhbWUgW2NvbHVtbnNdIGFyZSBlcXVpdmFsZW50XG4jICAgICAtIHRlc3RfZnVuY3Rpb25fcmVzdWx0LCB0ZXN0X2V4cHJlc3Npb25fcmVzdWx0In0=

Exercise 5: Further regex practice

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5saWJyYXJ5KHN0cmluZ3IpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBl
eHBlY3RlZCB0byBiZSBhcHByb3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCJcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApIiwic2FtcGxlIjoiIyBUaGUgdmFyaWFibGUgZG9jIGNvbnRhaW5zIGEgcGFyYWdyYXBoIGZyb20gSlBNJ3MgMjAxNCBhbm51YWwgcmVwb3J0XG5jYXQoZG9jKVxuXG4jIEZpbmQgYWxsIG51bWJlcnMgKGp1c3QgdGhlIG51bWJlcilcbnJlZ2V4MSA8LSBzdHJfZXh0cmFjdF9hbGwoKVxuXG4jIEZpbmQgYWxsIHBocmFzZXMgaW4gcGFyZW50aGVzZXMsIGluY2x1ZGluZyB0aGUgcGFyZW50aGVzZXMgdGhlbXNlbHZlc1xucmVnZXgyIDwtIHN0cl9leHRyYWN0X2FsbCgpXG5cbiMgRmluZCBhbGwgd29yZHMgdGhhdCBpbmNsdWRlIGEgZG91YmxlZCBsZXR0ZXIsIGxpa2UgXCJzc1wiIG9yIFwib29cIlxuIyBcInBhZ2VzICNcIi5cbnJlZ2V4MyA8LSBzdHJfZXh0cmFjdF9hbGwoKVxuXG4jIHByaW50IG91dCB0aGUgcmVzdWx0c1xucmVnZXgxXG5yZWdleDJcbnJlZ2V4M1xuXG4jRU5EIiwic29sdXRpb24iOiIjIFRoZSB2YXJpYWJsZSBkb2MgY29udGFpbnMgYSBwYXJhZ3JhcGggZnJvbSBKUE0ncyAyMDE0IGFubnVhbCByZXBvcnRcbmNhdChkb2MpXG5cbiMgRmluZCBhbGwgbnVtYmVycyAoanVzdCB0aGUgbnVtYmVyKVxucmVnZXgxIDwtIHN0cl9leHRyYWN0X2FsbChkb2MsIFwiWzpkaWdpdDpdK1wiKVxuXG4jIEZpbmQgYWxsIHBocmFzZXMgaW4gcGFyZW50aGVzZXMsIGluY2x1ZGluZyB0aGUgcGFyZW50aGVzZXMgdGhlbXNlbHZlc1xucmVnZXgyIDwtIHN0cl9leHRyYWN0X2FsbChkb2MsIFwiXFxcXCguKj9cXFxcKVwiKVxuXG4jIEZpbmQgYWxsIHdvcmRzIHRoYXQgaW5jbHVkZSBhIGRvdWJsZWQgbGV0dGVyLCBsaWtlIFwic3NcIiBvciBcIm9vXCJcbiMgXCJwYWdlcyAjXCIuXG5yZWdleDMgPC0gc3RyX2V4dHJhY3RfYWxsKGRvYywgXCJbOmFscGhhOl0qKFs6YWxwaGE6XSlcXFxcMVs6YWxwaGE6XSpcIilcblxuIyBwcmludCBvdXQgdGhlIHJlc3VsdHNcbnJlZ2V4MVxucmVnZXgyXG5yZWdleDNcblxuI0VORCIsInNjdCI6IiMgVGVtcGxhdGUgYmFzZWQgb24gaHR0cHM6Ly93d3cucmRvY3VtZW50YXRpb24ub3JnL3BhY2thZ2VzL3Rlc3R3aGF0L3ZlcnNpb25zLzQuMS4xXG4jIENoZWNrIGlmIHNvbWV0aGluZyBpcyBleHBsaWNpdGx5IHR5cGVkXG5cbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJsZW5ndGgocmVnZXgxW1sxXV0pXCIsIGluY29ycmVjdF9tc2c9XCJgcmVnZXgzYCBoYXMgYW4gaW5jb3JyZWN0IG51bWJl
ciBvZiBtYXRjaGVzXCIpXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwicmVnZXgxXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYHJlZ2V4MWAgaXNuJ3QgcXVpdGUgcmlnaHQsIGJ1dCBpdCBoYXMgdGhlIHJpZ2h0IG51bWJlciBvZiBtYXRjaGVzLlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcImxlbmd0aChyZWdleDJbWzFdXSlcIiwgaW5jb3JyZWN0X21zZz1cImByZWdleDJgIGhhcyBhbiBpbmNvcnJlY3QgbnVtYmVyIG9mIG1hdGNoZXNcIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJyZWdleDJcIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgcmVnZXgyYCBpc24ndCBxdWl0ZSByaWdodCwgYnV0IGl0IGhhcyB0aGUgcmlnaHQgbnVtYmVyIG9mIG1hdGNoZXMuXCIpXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwibGVuZ3RoKHJlZ2V4M1tbMV1dKVwiLCBpbmNvcnJlY3RfbXNnPVwiYHJlZ2V4M2AgaGFzIGFuIGluY29ycmVjdCBudW1iZXIgb2YgbWF0Y2hlcy4gIERpZCB5b3UgbWFrZSBzdXJlIHRvIGtlZXAgdGhlIHRleHQgc2hvcnQ/XCIpXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwicmVnZXgzXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYHJlZ2V4M2AgaXNuJ3QgcXVpdGUgcmlnaHQsIGJ1dCBpdCBoYXMgdGhlIHJpZ2h0IG51bWJlciBvZiBtYXRjaGVzLlwiKVxuXG4jIHRlc3Rfc3R1ZGVudF90eXBlZCgneCA8LSAyJywgbm90X3R5cGVkX21zZz0nJylcblxuIyBDaGVjayBpZiBmdW5jdGlvbiB3YXMgdXNlZCBpbiBpbnB1dCBjb2RlXG4jIHRlc3RfZnVuY3Rpb24oJ2MnLGluY29ycmVjdF9tc2c9JycpICBcblxuIyBSZXF1aXJlcyBhbiBvYmplY3QgYHhgIHRvIGhhdmUgdGhlIHNhbWUgdmFsdWUgYXMgdGhlIHNvbHV0aW9uXG4jIHRlc3Rfb2JqZWN0KFwieFwiLGluY29ycmVjdF9tc2cgPSBcIlwiLHVuZGVmaW5lZF9tc2cgPSBcIlwiKSAgXG5cbiMgUmVxdWlyZXMgYW4gb25qZWN0IHdpdGggdGhlIHNhbWUgdmFsdWUgb2YgYHhgIGluIHRoZSBzb2x1dGlvblxuIyB0ZXN0X2FuX29iamVjdChcInhcIix1bmRlZmluZWRfbXNnPVwiXCIpXG5cbiMgQ2hlY2tzIGlmIG91dHB1dCBvZiBzdHVkZW50J3MgY29kZSBjb250YWlucyBnaXZlbiBldmFsdWF0ZWQgZXhwcmVzc2lvblxuIyB0ZXN0X291dHB1dF9jb250YWlucyhcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIilcblxuIyBDaGVjayBpZiBhIHZlY3RvciBvZiBwcmVkZWZpbmVkIG9iamVjdHMgYXJlIHVuY2hhbmdlZFxuIyB0ZXN0X3ByZWRlZmluZWRfb2JqZWN0cyhjKCd4JywneScpLGluY29ycmVjdF9tc2c9XCJEb24ndCBvbnZlcndyaXRlIHRoZSBwcmVkZWZpbmVkIHZhcmlhYmxlc1wiKVxuXG4jIENoZWNrcyBmb3IgYSByZWdleCBwYXR0ZXJuIGluIHRyaGUgb3V0cHV0XG4jIHRlc3Rfb3V0cHV0X3JlZ2V4KHBhdHRlcm4sZml4ZWQ9RiwgdGltZXM9MSwgaW5jb3JyZWN0X21zZz0nJylcblxuIyBDYW4gY2hlY2sgYW4g
YXJiaXRyYXJ5IGV4cHJlc3Npb24gYWNyb3NzIGJvdGggc29sdXRpb24gYW5kIHN0dWRlbnQgY29kZVxuI3Rlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJ0eXBlb2YoY29tcGFueV9uYW1lKVwiLCBpbmNvcnJlY3RfbXNnPVwiRGlkIHlvdSBzdG9yZSB0ZXh0dWFsIGRhdGEgaW4gYGNvbXBhbnlfbmFtZWA/XCIpXG5cbnRlc3RfZXJyb3IoKVxuc3VjY2Vzc19tc2coXCJBd2Vzb21lIVwiKVxuXG4jIE90aGVyIGZ1bmN0aW9ucyB0byBub3RlOlxuIyAgICAgLSB0ZXN0X29yKGEsYikgLS0gY2hlY2tzIGlmIGVpdGhlciB0ZXN0IGEgb3IgdGVzdCBiIHBhc3NcbiMgICAgIC0gdGVzdF9nZ3Bsb3QoKSAtLSBjYW4gY2hlY2sgaWYgcGxvdHMgYXJlIGNvcnJlY3RcbiMgICAgIC0gdGVzdF9mdW5jdGlvbigpIC0tIGNhbiBhbHNvIGNoZWNrIGluY2x1ZGVkIHBhcmFtZXRlcnNcbiMgICAgIC0gdGVzdF9sb29wKCkgLS0gY2hlY2tpbmcgZm9yIGFuZCB3aGlsZSBsb29wc1xuIyAgICAgLSB0ZXN0X2xpYnJhcnlfZnVuY3Rpb24oJ3BhY2thZ2UnLCBub3RfY2FsbGVkX21zZz0nJyxpbmNvcnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2lmX2Vsc2UoKSAtLSBjaGVja2luZyBpZiBzdGF0ZW1lbnRzXG4jICAgICAtIHRlc3RfZXhwcmVzc2lvbl9lcnJvcigpIC0tIGNhbiBjaGVjayBpZiBmdW5jdGlvbnMgYXJlIHByb3Blcmx5IGRlZmluZWRcbiMgICAgIC0gdGVzdF9vcGVyYXRvcignb3BlcmF0b3InLCksIG5vdF9jYWxsZWRfbXNnPScnLGluY29ycmVjdF9tc2c9JycpXG4jICAgICAtIHRlc3RfZnVuY3Rpb25fZGVmaW5pdGlvbigpIC0tIHJpZ29yb3VzbHkgY2hlY2sgZGVmaW5lZCBmdW5jdGlvblxuIyAgICAgLSB0ZXN0X2RhdGFfZnJhbWUoKSAtLSBjaGVjayBpZiBkYXRhZnJhbWUgW2NvbHVtbnNdIGFyZSBlcXVpdmFsZW50XG4jICAgICAtIHRlc3RfZnVuY3Rpb25fcmVzdWx0LCB0ZXN0X2V4cHJlc3Npb25fcmVzdWx0In0=

Calculating quantities with text

Note: DataCamp Light does not include the quanteda, textdata, and tidytext packages, so I have provided sample code that you can run on your own computer in RStudio. If you don't have those packages, install them first with install.packages("quanteda"), install.packages("textdata"), and install.packages("tidytext").
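As a minimal sketch of that install step, the snippet below checks which packages are missing before installing anything. The helper name `missing_packages` is hypothetical. Adding quanteda.textstats and quanteda.textplots to the list is an assumption that only matters if you have quanteda 3.0 or later, where the textstat_ and textplot_ functions were moved into those separate packages.

```r
# Return the packages from `needed` that are not yet installed.
# `missing_packages` is a hypothetical helper name, not part of any package.
missing_packages <- function(needed,
                             installed = rownames(installed.packages())) {
  setdiff(needed, installed)
}

# The three packages named in the note, plus the two quanteda add-ons
# (an assumption: only required with quanteda 3.0 or later).
needed <- c("quanteda", "textdata", "tidytext",
            "quanteda.textstats", "quanteda.textplots")

to_install <- missing_packages(needed)

# Only install when run interactively (e.g. from the RStudio console)
if (interactive() && length(to_install) > 0) {
  install.packages(to_install)
}
```

If you install the add-on packages, load them with library(quanteda.textstats) or library(quanteda.textplots) alongside library(quanteda) in the relevant exercise.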

Each of the three exercises below can be run as a standalone script, as each code block contains all the imports it needs.

Code for this section can be downloaded as an Rmd file here. The output of this code can be viewed in this Rmarkdown notebook.

Exercise 6: Readability with Quanteda

How does the readability of JPMorgan’s annual report compare to the Citigroup annual report from class?

```r
# load in readr (or tidyverse) to get read_file() function
library(readr)

# Load in all of JPM's 2014 annual report
doc <- read_file("https://rmc.link/Slides/acct420v3/Session_7/0000019617-14-000289.txt")

# Load in quanteda
library(quanteda)

# Calculate the three readability measures
textstat_readability(doc, "Flesch.Kincaid")
textstat_readability(doc, "FOG")
textstat_readability(doc, "Coleman.Liau")
```

Exercise 7: Sentiment with tidytext

How does the sentiment of JPMorgan’s annual report compare to the Citigroup annual report from class?

```r
# load in readr (or tidyverse) to get read_file() function
library(readr)

# Load in all of JPM's 2014 annual report
doc <- read_file("https://rmc.link/Slides/acct420v3/Session_7/0000019617-14-000289.txt")

# Load in tidytext
library(tidytext)

# Load some components of tidyverse
library(dplyr)  # for the usual commands
library(tidyr)  # for spread

# convert document to tidy format
df_doc <- data.frame(ID=c("0000019617-14-000289"), text=c(doc),
                     stringsAsFactors = F) %>%
  unnest_tokens(word, text)

# Calculate term frequency
terms <- df_doc %>%
  count(ID, word, sort=TRUE) %>%
  ungroup()
total_terms <- terms %>% 
  group_by(ID) %>% 
  summarize(total = sum(n))
tf <- left_join(terms, total_terms) %>% mutate(tf=n/total)

# Get the Loughran McDonald sentiment dictionary
sentiment <- get_sentiments("loughran")

# Merge in sentiment
tf_sent <- tf %>% left_join(sentiment)

# Sum the term frequencies within each sentiment category
tf_sent %>%
  spread(sentiment, tf, fill=0) %>%
  select(constraining, litigious, negative, positive, superfluous, uncertainty) %>%
  colSums()
```
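The sentiment measure in this exercise is the share of a document's words that fall into each dictionary category. As a rough illustration of that calculation, here is a toy example that needs no downloads. The word counts and the three-word dictionary are made up for illustration and only stand in for the annual report and the Loughran McDonald dictionary. It uses group_by() and summarize() as one possible way to total the shares.

```r
library(dplyr)

# Toy term counts (hypothetical values, not taken from the annual report)
tf <- data.frame(ID = "doc1",
                 word = c("loss", "gain", "may", "the"),
                 n = c(2, 1, 1, 6),
                 stringsAsFactors = FALSE) %>%
  mutate(tf = n / sum(n))  # term frequency: count divided by total words

# Toy dictionary (hypothetical; stands in for get_sentiments("loughran"))
sentiment <- data.frame(word = c("loss", "gain", "may"),
                        sentiment = c("negative", "positive", "uncertainty"),
                        stringsAsFactors = FALSE)

# Words with no dictionary entry get NA after the join and are dropped,
# then the term frequencies are summed within each sentiment category
scores <- tf %>%
  left_join(sentiment, by = "word") %>%
  filter(!is.na(sentiment)) %>%
  group_by(sentiment) %>%
  summarize(share = sum(tf))

print(scores)
```

Here "loss" appears 2 times out of 10 words, so the negative share is 0.2, while the positive and uncertainty shares are 0.1 each.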

Exercise 8: Make a word cloud after removing stopwords

```r
# load in readr (or tidyverse) to get read_file() function
library(readr)

# Load in all of JPM's 2014 annual report
doc <- read_file("https://rmc.link/Slides/acct420v3/Session_7/0000019617-14-000289.txt")

# Load in quanteda and tidytext
library(quanteda)
library(tidytext)

# Load in some of tidyverse
library(dplyr)

# convert document to tidy format
df_doc <- data.frame(ID=c("0000019617-14-000289"), text=c(doc),
                     stringsAsFactors = F) %>%
  unnest_tokens(word, text)

# Pull a list of stopwords
stopwords <- stopwords::stopwords(source="smart")

# Remove stopwords
df_doc_stop <- df_doc %>%
  anti_join(data.frame(word=stopwords, stringsAsFactors=F))

# Calculate term frequency
terms <- df_doc_stop %>%
  count(ID, word, sort=TRUE) %>%
  ungroup()
total_terms <- terms %>% 
  group_by(ID) %>% 
  summarize(total = sum(n))
tf <- left_join(terms, total_terms) %>% mutate(tf=n/total)

# Build a corpus object for quanteda
corp <- cast_dfm(tf, ID, word, n)

# Plot a word cloud -- If you don't have RColorBrewer installed, you can
# remove the `color=` option.
textplot_wordcloud(dfm(corp), color = RColorBrewer::brewer.pal(9, "Set1"))
```