Working with text data

To run your code, click Run. The checker will tell you whether your code is correct and, if it is not, offer hints on what to fix. If you are stuck, click Solution to see the answer.

Exercise 1: Subsetting data

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBleHBlY3RlZCB0byBiZSBhcHBy
b3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCIiLCJzYW1wbGUiOiIjIFRoZSB2YXJpYWJsZSBiZWxvdyBjb250YWlucyBhIHBhcmFncmFwaCBmcm9tIEpQTSdzIDIwMTQgYW5udWFsIHJlcG9ydFxuY2F0KGRvYylcblxuIyBMb2FkIGluIHN0cmluZ3IgZm9yIGVhc3kgc3RyaW5nIGZ1bmN0aW9uc1xubGlicmFyeShzdHJpbmdyKVxuXG4jIEV4dHJhY3QgdGhlIGZpcnN0IDEwMCBjaGFyYWN0ZXJzIGFuZCBzdG9yZSB0aGVtIGluIGEgdmFyaWFibGUgbmFtZSBmaXJzdFxuZmlyc3QgPC0gXG5cbiMgRXh0cmFjdCB0aGUgbWlkZGxlIDEwMCBjaGFyYWN0ZXJzICg3OTMgdG8gODkyLCBpbmNsdXNpdmUpIGFuZCBzdG9yZSBpbiBhXG4jIFZhcmlhYmxlIG5hbWVkIG1pZGRsZVxubWlkZGxlIDwtIFxuXG4jIEV4dHJhY3QgdGhlIGxhc3QgMTAwIGNoYXJhY3RlcnMgYW5kIHN0b3JlIHRoZW0gaW4gYSB2YXJpYWJsZSBuYW1lZCBsYXN0XG5sYXN0IDwtIFxuXG4jIHByaW50IG91dCB0aGUgc3Vic2V0c1xucHJpbnQocGFzdGUoZmlyc3QsIG1pZGRsZSwgbGFzdCwgc2VwPScgLS0tICcpKVxuI0VORCIsInNvbHV0aW9uIjoiIyBUaGUgdmFyaWFibGUgYmVsb3cgY29udGFpbnMgYSBwYXJhZ3JhcGggZnJvbSBKUE0ncyAyMDE0IGFubnVhbCByZXBvcnRcbmNhdChkb2MpXG5cbiMgTG9hZCBpbiBzdHJpbmdyIGZvciBlYXN5IHN0cmluZyBmdW5jdGlvbnNcbmxpYnJhcnkoc3RyaW5ncilcblxuIyBFeHRyYWN0IHRoZSBmaXJzdCAxMDAgY2hhcmFjdGVycyBhbmQgc3RvcmUgdGhlbSBpbiBhIHZhcmlhYmxlIG5hbWUgZmlyc3RcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApXG5cbiMgRXh0cmFjdCB0aGUgbWlkZGxlIDEwMCBjaGFyYWN0ZXJzICg3OTMgdG8gODkyLCBpbmNsdXNpdmUpIGFuZCBzdG9yZSBpbiBhXG4jIFZhcmlhYmxlIG5hbWVkIG1pZGRsZVxubWlkZGxlIDwtIHN0cl9zdWIoZG9jLCA3OTMsIDg5MilcblxuIyBFeHRyYWN0IHRoZSBsYXN0IDEwMCBjaGFyYWN0ZXJzIGFuZCBzdG9yZSB0aGVtIGluIGEgdmFyaWFibGUgbmFtZWQgbGFzdFxubGFzdCA8LSBzdHJfc3ViKGRvYywgc3RyX2xlbmd0aChkb2MpLTk5LCBzdHJfbGVuZ3RoKGRvYykpXG5cbiMgcHJpbnQgb3V0IHRoZSBzdWJzZXRzXG5wcmludChwYXN0ZShmaXJzdCwgbWlkZGxlLCBsYXN0LCBzZXA9JyAtLS0gJykpXG4jRU5EIiwic2N0IjoiIyBUZW1wbGF0ZSBiYXNlZCBvbiBodHRwczovL3d3dy5yZG9jdW1lbnRhdGlvbi5vcmcvcGFja2FnZXMvdGVzdHdoYXQvdmVyc2lvbnMvNC4xLjFcbiMgQ2hl
Y2sgaWYgc29tZXRoaW5nIGlzIGV4cGxpY2l0bHkgdHlwZWRcblxudGVzdF9leHByZXNzaW9uX291dHB1dChcInN0cl9sZW5ndGgoZmlyc3QpXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYGZpcnN0YCBpc24ndCB0aGUgcmlnaHQgbGVuZ3RoLlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcImZpcnN0XCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYGZpcnN0YCBpc24ndCBxdWl0ZSByaWdodC5cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJzdHJfbGVuZ3RoKG1pZGRsZSlcIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgbWlkZGxlYCBpc24ndCB0aGUgcmlnaHQgbGVuZ3RoLlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcIm1pZGRsZVwiLCBpbmNvcnJlY3RfbXNnPVwiSXQgYXBwZWFycyB5b3VyIGBtaWRkbGVgIGlzbid0IHF1aXRlIHJpZ2h0LlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcInN0cl9sZW5ndGgobGFzdClcIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgbGFzdGAgaXNuJ3QgdGhlIHJpZ2h0IGxlbmd0aC5cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJsYXN0XCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYGxhc3RgIGlzbid0IHF1aXRlIHJpZ2h0LlwiKVxuXG4jIHRlc3Rfc3R1ZGVudF90eXBlZCgneCA8LSAyJywgbm90X3R5cGVkX21zZz0nJylcblxuIyBDaGVjayBpZiBmdW5jdGlvbiB3YXMgdXNlZCBpbiBpbnB1dCBjb2RlXG4jIHRlc3RfZnVuY3Rpb24oJ2MnLGluY29ycmVjdF9tc2c9JycpICBcblxuIyBSZXF1aXJlcyBhbiBvYmplY3QgYHhgIHRvIGhhdmUgdGhlIHNhbWUgdmFsdWUgYXMgdGhlIHNvbHV0aW9uXG4jIHRlc3Rfb2JqZWN0KFwieFwiLGluY29ycmVjdF9tc2cgPSBcIlwiLHVuZGVmaW5lZF9tc2cgPSBcIlwiKSAgXG5cbiMgUmVxdWlyZXMgYW4gb25qZWN0IHdpdGggdGhlIHNhbWUgdmFsdWUgb2YgYHhgIGluIHRoZSBzb2x1dGlvblxuIyB0ZXN0X2FuX29iamVjdChcInhcIix1bmRlZmluZWRfbXNnPVwiXCIpXG5cbiMgQ2hlY2tzIGlmIG91dHB1dCBvZiBzdHVkZW50J3MgY29kZSBjb250YWlucyBnaXZlbiBldmFsdWF0ZWQgZXhwcmVzc2lvblxuIyB0ZXN0X291dHB1dF9jb250YWlucyhcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIilcblxuIyBDaGVjayBpZiBhIHZlY3RvciBvZiBwcmVkZWZpbmVkIG9iamVjdHMgYXJlIHVuY2hhbmdlZFxuIyB0ZXN0X3ByZWRlZmluZWRfb2JqZWN0cyhjKCd4JywneScpLGluY29ycmVjdF9tc2c9XCJEb24ndCBvbnZlcndyaXRlIHRoZSBwcmVkZWZpbmVkIHZhcmlhYmxlc1wiKVxuXG4jIENoZWNrcyBmb3IgYSByZWdleCBwYXR0ZXJuIGluIHRyaGUgb3V0cHV0XG4jIHRlc3Rfb3V0cHV0X3JlZ2V4KHBhdHRlcm4sZml4ZWQ9RiwgdGltZXM9MSwgaW5jb3JyZWN0X21zZz0nJylcblxuIyBDYW4gY2hlY2sgYW4gYXJiaXRyYXJ5IGV4cHJlc3Npb24gYWNyb3NzIGJv
dGggc29sdXRpb24gYW5kIHN0dWRlbnQgY29kZVxuI3Rlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJ0eXBlb2YoY29tcGFueV9uYW1lKVwiLCBpbmNvcnJlY3RfbXNnPVwiRGlkIHlvdSBzdG9yZSB0ZXh0dWFsIGRhdGEgaW4gYGNvbXBhbnlfbmFtZWA/XCIpXG5cbnRlc3RfZXJyb3IoKVxuc3VjY2Vzc19tc2coXCJBd2Vzb21lIVwiKVxuXG4jIE90aGVyIGZ1bmN0aW9ucyB0byBub3RlOlxuIyAgICAgLSB0ZXN0X29yKGEsYikgLS0gY2hlY2tzIGlmIGVpdGhlciB0ZXN0IGEgb3IgdGVzdCBiIHBhc3NcbiMgICAgIC0gdGVzdF9nZ3Bsb3QoKSAtLSBjYW4gY2hlY2sgaWYgcGxvdHMgYXJlIGNvcnJlY3RcbiMgICAgIC0gdGVzdF9mdW5jdGlvbigpIC0tIGNhbiBhbHNvIGNoZWNrIGluY2x1ZGVkIHBhcmFtZXRlcnNcbiMgICAgIC0gdGVzdF9sb29wKCkgLS0gY2hlY2tpbmcgZm9yIGFuZCB3aGlsZSBsb29wc1xuIyAgICAgLSB0ZXN0X2xpYnJhcnlfZnVuY3Rpb24oJ3BhY2thZ2UnLCBub3RfY2FsbGVkX21zZz0nJyxpbmNvcnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2lmX2Vsc2UoKSAtLSBjaGVja2luZyBpZiBzdGF0ZW1lbnRzXG4jICAgICAtIHRlc3RfZXhwcmVzc2lvbl9lcnJvcigpIC0tIGNhbiBjaGVjayBpZiBmdW5jdGlvbnMgYXJlIHByb3Blcmx5IGRlZmluZWRcbiMgICAgIC0gdGVzdF9vcGVyYXRvcignb3BlcmF0b3InLCksIG5vdF9jYWxsZWRfbXNnPScnLGluY29ycmVjdF9tc2c9JycpXG4jICAgICAtIHRlc3RfZnVuY3Rpb25fZGVmaW5pdGlvbigpIC0tIHJpZ29yb3VzbHkgY2hlY2sgZGVmaW5lZCBmdW5jdGlvblxuIyAgICAgLSB0ZXN0X2RhdGFfZnJhbWUoKSAtLSBjaGVjayBpZiBkYXRhZnJhbWUgW2NvbHVtbnNdIGFyZSBlcXVpdmFsZW50XG4jICAgICAtIHRlc3RfZnVuY3Rpb25fcmVzdWx0LCB0ZXN0X2V4cHJlc3Npb25fcmVzdWx0In0=
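
As a warm-up for subsetting, here is a minimal sketch of position-based extraction with stringr's str_sub(). The sentence and the variable names are made up for illustration and are not the exercise's data; positions are 1-based and inclusive at both ends.

```r
library(stringr)

# Toy sentence (an assumption for illustration, not the exercise text)
text <- "The quick brown fox jumps over the lazy dog"

# Characters 1 through 9, inclusive
first_nine <- str_sub(text, 1, 9)     # "The quick"

# Characters 11 through 19, inclusive
middle <- str_sub(text, 11, 19)       # "brown fox"

# Negative positions count back from the end of the string
last_eight <- str_sub(text, -8, -1)   # "lazy dog"

# The same tail, written with explicit positions from str_length()
n <- str_length(text)
last_eight_explicit <- str_sub(text, n - 7, n)

print(paste(first_nine, middle, last_eight, sep = " --- "))
```

To take the last k characters with explicit positions, start at length minus k plus 1; starting at length minus k returns one character too many.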

Exercise 2: Changing case

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5saWJyYXJ5KHN0cmluZ3IpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBl
eHBlY3RlZCB0byBiZSBhcHByb3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCJcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApIiwic2FtcGxlIjoiIyBUaGUgdmFyaWFibGUgZG9jIGNvbnRhaW5zIGEgcGFyYWdyYXBoIGZyb20gSlBNJ3MgMjAxNCBhbm51YWwgcmVwb3J0XG4jIFRoZSB2YXJpYWJsZSBmaXJzdCBmcm9tIHRoZSBwcmV2aW91cyBleGVyY2lzZSBpcyBsb2FkZWRcbmNhdChmaXJzdClcblxuIyBDcmVhdGUgYSB2ZXJzaW9uIG9mIGZpcnN0IHRoYXQgaXMgYWxsIGxvd2VyY2FzZVxubG93ZXIgPC0gXG5cbiMgQ3JlYXRlIGEgdmVyc2lvbiBvZiBmaXJzdCB0aGF0IGlzIGFsbCBVUFBFUkNBU0VcbnVwcGVyIDwtIFxuXG4jIENyZWF0ZSBhIHZlcnNpb24gb2YgZmlyc3QgdGhhdCBpcyBhbGwgVGl0bGVjYXNlXG50aXRsZSA8LSBcblxuIyBwcmludCBvdXQgdGhlIGRpZmZlcmVudCB2ZXJzaW9uc1xucHJpbnQocGFzdGUobG93ZXIsIHVwcGVyLCB0aXRsZSwgc2VwPScgLS0tICcpKVxuICBcbiNFTkQiLCJzb2x1dGlvbiI6IiMgVGhlIHZhcmlhYmxlIGRvYyBjb250YWlucyBhIHBhcmFncmFwaCBmcm9tIEpQTSdzIDIwMTQgYW5udWFsIHJlcG9ydFxuIyBUaGUgdmFyaWFibGUgZmlyc3QgZnJvbSB0aGUgcHJldmlvdXMgZXhlcmNpc2UgaXMgbG9hZGVkXG5jYXQoZmlyc3QpXG5cbiMgQ3JlYXRlIGEgdmVyc2lvbiBvZiBmaXJzdCB0aGF0IGlzIGFsbCBsb3dlcmNhc2Vcbmxvd2VyIDwtIHN0cl90b19sb3dlcihmaXJzdClcblxuIyBDcmVhdGUgYSB2ZXJzaW9uIG9mIGZpcnN0IHRoYXQgaXMgYWxsIFVQUEVSQ0FTRVxudXBwZXIgPC0gc3RyX3RvX3VwcGVyKGZpcnN0KVxuXG4jIENyZWF0ZSBhIHZlcnNpb24gb2YgZmlyc3QgdGhhdCBpcyBhbGwgVGl0bGVjYXNlXG50aXRsZSA8LSBzdHJfdG9fdGl0bGUoZmlyc3QpXG5cbiMgcHJpbnQgb3V0IHRoZSBkaWZmZXJlbnQgdmVyc2lvbnNcbnByaW50KHBhc3RlKGxvd2VyLCB1cHBlciwgdGl0bGUsIHNlcD0nIC0tLSAnKSlcbiAgXG4jRU5EIiwic2N0IjoiIyBUZW1wbGF0ZSBiYXNlZCBvbiBodHRwczovL3d3dy5yZG9jdW1lbnRhdGlvbi5vcmcvcGFja2FnZXMvdGVzdHdoYXQvdmVyc2lvbnMvNC4xLjFcbiMgQ2hlY2sgaWYgc29tZXRoaW5nIGlzIGV4cGxpY2l0bHkgdHlwZWRcblxudGVzdF9leHByZXNzaW9uX291dHB1dChcImxvd2VyXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYGxvd2VyYCBpc24ndCBxdWl0ZSByaWdodC5cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJ1cHBl
clwiLCBpbmNvcnJlY3RfbXNnPVwiSXQgYXBwZWFycyB5b3VyIGB1cHBlcmAgaXNuJ3QgcXVpdGUgcmlnaHQuXCIpXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwidGl0bGVcIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgdGl0bGVgIGlzbid0IHF1aXRlIHJpZ2h0LlwiKVxuXG4jIHRlc3Rfc3R1ZGVudF90eXBlZCgneCA8LSAyJywgbm90X3R5cGVkX21zZz0nJylcblxuIyBDaGVjayBpZiBmdW5jdGlvbiB3YXMgdXNlZCBpbiBpbnB1dCBjb2RlXG4jIHRlc3RfZnVuY3Rpb24oJ2MnLGluY29ycmVjdF9tc2c9JycpICBcblxuIyBSZXF1aXJlcyBhbiBvYmplY3QgYHhgIHRvIGhhdmUgdGhlIHNhbWUgdmFsdWUgYXMgdGhlIHNvbHV0aW9uXG4jIHRlc3Rfb2JqZWN0KFwieFwiLGluY29ycmVjdF9tc2cgPSBcIlwiLHVuZGVmaW5lZF9tc2cgPSBcIlwiKSAgXG5cbiMgUmVxdWlyZXMgYW4gb25qZWN0IHdpdGggdGhlIHNhbWUgdmFsdWUgb2YgYHhgIGluIHRoZSBzb2x1dGlvblxuIyB0ZXN0X2FuX29iamVjdChcInhcIix1bmRlZmluZWRfbXNnPVwiXCIpXG5cbiMgQ2hlY2tzIGlmIG91dHB1dCBvZiBzdHVkZW50J3MgY29kZSBjb250YWlucyBnaXZlbiBldmFsdWF0ZWQgZXhwcmVzc2lvblxuIyB0ZXN0X291dHB1dF9jb250YWlucyhcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIilcblxuIyBDaGVjayBpZiBhIHZlY3RvciBvZiBwcmVkZWZpbmVkIG9iamVjdHMgYXJlIHVuY2hhbmdlZFxuIyB0ZXN0X3ByZWRlZmluZWRfb2JqZWN0cyhjKCd4JywneScpLGluY29ycmVjdF9tc2c9XCJEb24ndCBvbnZlcndyaXRlIHRoZSBwcmVkZWZpbmVkIHZhcmlhYmxlc1wiKVxuXG4jIENoZWNrcyBmb3IgYSByZWdleCBwYXR0ZXJuIGluIHRyaGUgb3V0cHV0XG4jIHRlc3Rfb3V0cHV0X3JlZ2V4KHBhdHRlcm4sZml4ZWQ9RiwgdGltZXM9MSwgaW5jb3JyZWN0X21zZz0nJylcblxuIyBDYW4gY2hlY2sgYW4gYXJiaXRyYXJ5IGV4cHJlc3Npb24gYWNyb3NzIGJvdGggc29sdXRpb24gYW5kIHN0dWRlbnQgY29kZVxuI3Rlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJ0eXBlb2YoY29tcGFueV9uYW1lKVwiLCBpbmNvcnJlY3RfbXNnPVwiRGlkIHlvdSBzdG9yZSB0ZXh0dWFsIGRhdGEgaW4gYGNvbXBhbnlfbmFtZWA/XCIpXG5cbnRlc3RfZXJyb3IoKVxuc3VjY2Vzc19tc2coXCJBd2Vzb21lIVwiKVxuXG4jIE90aGVyIGZ1bmN0aW9ucyB0byBub3RlOlxuIyAgICAgLSB0ZXN0X29yKGEsYikgLS0gY2hlY2tzIGlmIGVpdGhlciB0ZXN0IGEgb3IgdGVzdCBiIHBhc3NcbiMgICAgIC0gdGVzdF9nZ3Bsb3QoKSAtLSBjYW4gY2hlY2sgaWYgcGxvdHMgYXJlIGNvcnJlY3RcbiMgICAgIC0gdGVzdF9mdW5jdGlvbigpIC0tIGNhbiBhbHNvIGNoZWNrIGluY2x1ZGVkIHBhcmFtZXRlcnNcbiMgICAgIC0gdGVzdF9sb29wKCkgLS0gY2hlY2tpbmcgZm9yIGFuZCB3aGlsZSBsb29wc1xuIyAgICAgLSB0ZXN0X2xpYnJhcnlfZnVuY3Rpb24oJ3BhY2thZ2UnLCBub3RfY2FsbGVk
X21zZz0nJyxpbmNvcnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2lmX2Vsc2UoKSAtLSBjaGVja2luZyBpZiBzdGF0ZW1lbnRzXG4jICAgICAtIHRlc3RfZXhwcmVzc2lvbl9lcnJvcigpIC0tIGNhbiBjaGVjayBpZiBmdW5jdGlvbnMgYXJlIHByb3Blcmx5IGRlZmluZWRcbiMgICAgIC0gdGVzdF9vcGVyYXRvcignb3BlcmF0b3InLCksIG5vdF9jYWxsZWRfbXNnPScnLGluY29ycmVjdF9tc2c9JycpXG4jICAgICAtIHRlc3RfZnVuY3Rpb25fZGVmaW5pdGlvbigpIC0tIHJpZ29yb3VzbHkgY2hlY2sgZGVmaW5lZCBmdW5jdGlvblxuIyAgICAgLSB0ZXN0X2RhdGFfZnJhbWUoKSAtLSBjaGVjayBpZiBkYXRhZnJhbWUgW2NvbHVtbnNdIGFyZSBlcXVpdmFsZW50XG4jICAgICAtIHRlc3RfZnVuY3Rpb25fcmVzdWx0LCB0ZXN0X2V4cHJlc3Npb25fcmVzdWx0In0=
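
The three case conversions in stringr can be sketched on a short made-up headline (the string and variable names are assumptions for illustration). Note that title case also lowercases the rest of each word, so an acronym such as SEC does not survive it.

```r
library(stringr)

# Toy headline (hypothetical, not the exercise text)
headline <- "SEC approves money market REFORMS"

lower <- str_to_lower(headline)   # "sec approves money market reforms"
upper <- str_to_upper(headline)   # "SEC APPROVES MONEY MARKET REFORMS"
title <- str_to_title(headline)   # "Sec Approves Money Market Reforms"

print(paste(lower, upper, title, sep = " --- "))
```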

Exercise 3: Searching for phrases

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5saWJyYXJ5KHN0cmluZ3IpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBl
eHBlY3RlZCB0byBiZSBhcHByb3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCJcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApIiwic2FtcGxlIjoiIyBUaGUgdmFyaWFibGUgZG9jIGNvbnRhaW5zIGEgcGFyYWdyYXBoIGZyb20gSlBNJ3MgMjAxNCBhbm51YWwgcmVwb3J0XG5jYXQoZG9jKVxuXG4jIEhvdyBtYW55IHRpbWVzIGRvZXMgdGhlIHBhcmFncmFwaCBtZW50aW9uIHRoZSBcIlNFQ1wiP1xuU0VDX21lbnRpb25zIDwtIFxuXG4jIFdoZXJlIGlzIHRoZSBTRUMgbWVudGlvbmVkPyAgKEFsbCBsb2NhdGlvbnMpXG5TRUNfbG9jYXRpb24gPC0gXG5cbiMgcHJpbnQgb3V0IHRoZSByZXN1bHRzXG5TRUNfbWVudGlvbnNcblNFQ19sb2NhdGlvblxuXG4jRU5EIiwic29sdXRpb24iOiIjIFRoZSB2YXJpYWJsZSBkb2MgY29udGFpbnMgYSBwYXJhZ3JhcGggZnJvbSBKUE0ncyAyMDE0IGFubnVhbCByZXBvcnRcbmNhdChkb2MpXG5cbiMgSG93IG1hbnkgdGltZXMgZG9lcyB0aGUgcGFyYWdyYXBoIG1lbnRpb24gdGhlIFwiU0VDXCI/XG5TRUNfbWVudGlvbnMgPC0gc3RyX2NvdW50KGRvYywgXCJTRUNcIilcblxuIyBXaGVyZSBpcyB0aGUgU0VDIG1lbnRpb25lZD8gIChBbGwgbG9jYXRpb25zKVxuU0VDX2xvY2F0aW9uIDwtIHN0cl9sb2NhdGVfYWxsKGRvYywgXCJTRUNcIilcblxuIyBwcmludCBvdXQgdGhlIHJlc3VsdHNcblNFQ19tZW50aW9uc1xuU0VDX2xvY2F0aW9uXG5cbiNFTkQiLCJzY3QiOiIjIFRlbXBsYXRlIGJhc2VkIG9uIGh0dHBzOi8vd3d3LnJkb2N1bWVudGF0aW9uLm9yZy9wYWNrYWdlcy90ZXN0d2hhdC92ZXJzaW9ucy80LjEuMVxuIyBDaGVjayBpZiBzb21ldGhpbmcgaXMgZXhwbGljaXRseSB0eXBlZFxuXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwiU0VDX21lbnRpb25zXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYFNFQ19tZW50aW9uc2AgaXNuJ3QgcXVpdGUgcmlnaHQuXCIpXG50ZXN0X3N0dWRlbnRfdHlwZWQoXCJzdHJfbG9jYXRlX2FsbFwiLCBub3RfdHlwZWRfbXNnPVwiRGlkIHlvdSByZW1lbWJlciB0byB1c2UgYHN0cl9sb2NhdGVfYWxsKClgIHNvIGFzIHRvIGdldCBBTEwgbWVudGlvbnMnIGxvY2F0aW9ucz9cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJTRUNfbG9jYXRpb25cIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgU0VDX2xvY2F0aW9uYCBpc24ndCBxdWl0ZSByaWdodC5cIilcblxuIyB0ZXN0X3N0dWRlbnRfdHlwZWQoJ3ggPC0gMicsIG5vdF90eXBlZF9tc2c9Jycp
XG5cbiMgQ2hlY2sgaWYgZnVuY3Rpb24gd2FzIHVzZWQgaW4gaW5wdXQgY29kZVxuIyB0ZXN0X2Z1bmN0aW9uKCdjJyxpbmNvcnJlY3RfbXNnPScnKSAgXG5cbiMgUmVxdWlyZXMgYW4gb2JqZWN0IGB4YCB0byBoYXZlIHRoZSBzYW1lIHZhbHVlIGFzIHRoZSBzb2x1dGlvblxuIyB0ZXN0X29iamVjdChcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIix1bmRlZmluZWRfbXNnID0gXCJcIikgIFxuXG4jIFJlcXVpcmVzIGFuIG9uamVjdCB3aXRoIHRoZSBzYW1lIHZhbHVlIG9mIGB4YCBpbiB0aGUgc29sdXRpb25cbiMgdGVzdF9hbl9vYmplY3QoXCJ4XCIsdW5kZWZpbmVkX21zZz1cIlwiKVxuXG4jIENoZWNrcyBpZiBvdXRwdXQgb2Ygc3R1ZGVudCdzIGNvZGUgY29udGFpbnMgZ2l2ZW4gZXZhbHVhdGVkIGV4cHJlc3Npb25cbiMgdGVzdF9vdXRwdXRfY29udGFpbnMoXCJ4XCIsaW5jb3JyZWN0X21zZyA9IFwiXCIpXG5cbiMgQ2hlY2sgaWYgYSB2ZWN0b3Igb2YgcHJlZGVmaW5lZCBvYmplY3RzIGFyZSB1bmNoYW5nZWRcbiMgdGVzdF9wcmVkZWZpbmVkX29iamVjdHMoYygneCcsJ3knKSxpbmNvcnJlY3RfbXNnPVwiRG9uJ3Qgb252ZXJ3cml0ZSB0aGUgcHJlZGVmaW5lZCB2YXJpYWJsZXNcIilcblxuIyBDaGVja3MgZm9yIGEgcmVnZXggcGF0dGVybiBpbiB0cmhlIG91dHB1dFxuIyB0ZXN0X291dHB1dF9yZWdleChwYXR0ZXJuLGZpeGVkPUYsIHRpbWVzPTEsIGluY29ycmVjdF9tc2c9JycpXG5cbiMgQ2FuIGNoZWNrIGFuIGFyYml0cmFyeSBleHByZXNzaW9uIGFjcm9zcyBib3RoIHNvbHV0aW9uIGFuZCBzdHVkZW50IGNvZGVcbiN0ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwidHlwZW9mKGNvbXBhbnlfbmFtZSlcIiwgaW5jb3JyZWN0X21zZz1cIkRpZCB5b3Ugc3RvcmUgdGV4dHVhbCBkYXRhIGluIGBjb21wYW55X25hbWVgP1wiKVxuXG50ZXN0X2Vycm9yKClcbnN1Y2Nlc3NfbXNnKFwiQXdlc29tZSFcIilcblxuIyBPdGhlciBmdW5jdGlvbnMgdG8gbm90ZTpcbiMgICAgIC0gdGVzdF9vcihhLGIpIC0tIGNoZWNrcyBpZiBlaXRoZXIgdGVzdCBhIG9yIHRlc3QgYiBwYXNzXG4jICAgICAtIHRlc3RfZ2dwbG90KCkgLS0gY2FuIGNoZWNrIGlmIHBsb3RzIGFyZSBjb3JyZWN0XG4jICAgICAtIHRlc3RfZnVuY3Rpb24oKSAtLSBjYW4gYWxzbyBjaGVjayBpbmNsdWRlZCBwYXJhbWV0ZXJzXG4jICAgICAtIHRlc3RfbG9vcCgpIC0tIGNoZWNraW5nIGZvciBhbmQgd2hpbGUgbG9vcHNcbiMgICAgIC0gdGVzdF9saWJyYXJ5X2Z1bmN0aW9uKCdwYWNrYWdlJywgbm90X2NhbGxlZF9tc2c9JycsaW5jb3JyZWN0X21zZz0nJylcbiMgICAgIC0gdGVzdF9pZl9lbHNlKCkgLS0gY2hlY2tpbmcgaWYgc3RhdGVtZW50c1xuIyAgICAgLSB0ZXN0X2V4cHJlc3Npb25fZXJyb3IoKSAtLSBjYW4gY2hlY2sgaWYgZnVuY3Rpb25zIGFyZSBwcm9wZXJseSBkZWZpbmVkXG4jICAgICAtIHRlc3Rfb3BlcmF0b3IoJ29wZXJhdG9yJywpLCBub3RfY2FsbGVkX21zZz0nJyxpbmNv
cnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2Z1bmN0aW9uX2RlZmluaXRpb24oKSAtLSByaWdvcm91c2x5IGNoZWNrIGRlZmluZWQgZnVuY3Rpb25cbiMgICAgIC0gdGVzdF9kYXRhX2ZyYW1lKCkgLS0gY2hlY2sgaWYgZGF0YWZyYW1lIFtjb2x1bW5zXSBhcmUgZXF1aXZhbGVudFxuIyAgICAgLSB0ZXN0X2Z1bmN0aW9uX3Jlc3VsdCwgdGVzdF9leHByZXNzaW9uX3Jlc3VsdCJ9
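
Counting and locating a phrase can be sketched as follows, using a made-up note rather than the exercise paragraph. str_count() returns how many times the pattern matches, while str_locate_all() returns a list holding one matrix per input string, with a start and an end column for every match; str_locate() would report only the first match.

```r
library(stringr)

# Toy note (hypothetical, not the exercise text)
note <- "The SEC met in June. Later, the SEC voted."

# Number of matches of the literal phrase "SEC"
mentions <- str_count(note, "SEC")        # 2

# Start and end positions of every match
locations <- str_locate_all(note, "SEC")
starts <- locations[[1]][, "start"]       # 5 and 33
ends   <- locations[[1]][, "end"]         # 7 and 35

mentions
locations
```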

Regular expressions

Exercise 4: Finding mentions by pattern

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5saWJyYXJ5KHN0cmluZ3IpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBl
eHBlY3RlZCB0byBiZSBhcHByb3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCJcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApIiwic2FtcGxlIjoiIyBUaGUgdmFyaWFibGUgZG9jIGNvbnRhaW5zIGEgcGFyYWdyYXBoIGZyb20gSlBNJ3MgMjAxNCBhbm51YWwgcmVwb3J0XG5jYXQoZG9jKVxuXG4jIHdoYXQgYXJlIHRoZSBwYWdlIHJlZmVyZW5jZXMgaW4gdGhlIHRleHQ/ICBUaGVzZSBhcmUgb2YgdGhlIGZvcm0gXCJwYWdlICNcIiBvclxuIyBcInBhZ2VzICNcIi5cbnBhZ2VfcmVmZXJlbmNlcyA8LSBzdHJfZXh0cmFjdF9hbGwoKVxuXG4jIEhvdyB3YXMgdGhlIFNFQyBtZW50aW9uZWQ/IEV4dHJhY3QgdGhlIHRleHQgZnJvbSB0d28gd29yZHMgYmVmb3JlIFwiU0VDXCIgaXNcbiMgbWVudGlvbmVkIHVudGlsIHR3byB3b3JkcyBhZnRlciB0aGUgU0VDIGlzIG1lbnRpb25lZFxuU0VDX3JlZmVyZW5jZXMgPC0gc3RyX2V4dHJhY3RfYWxsKClcblxuIyBwcmludCBvdXQgdGhlIHJlc3VsdHNcbnBhZ2VfcmVmZXJlbmNlc1xuU0VDX2xvY2F0aW9uXG5cbiNFTkQiLCJzb2x1dGlvbiI6IiMgVGhlIHZhcmlhYmxlIGRvYyBjb250YWlucyBhIHBhcmFncmFwaCBmcm9tIEpQTSdzIDIwMTQgYW5udWFsIHJlcG9ydFxuY2F0KGRvYylcblxuIyB3aGF0IGFyZSB0aGUgcGFnZSByZWZlcmVuY2VzIGluIHRoZSB0ZXh0PyAgVGhlc2UgYXJlIG9mIHRoZSBmb3JtIFwicGFnZSAjXCIgb3JcbiMgXCJwYWdlcyAjXCIuXG5wYWdlX3JlZmVyZW5jZXMgPC0gc3RyX2V4dHJhY3RfYWxsKGRvYywgXCJwYWdlcz9cXFxcc1s6ZGlnaXQ6XStcIilcblxuIyBIb3cgd2FzIHRoZSBTRUMgbWVudGlvbmVkPyBFeHRyYWN0IHRoZSB0ZXh0IGZyb20gdHdvIHdvcmRzIGJlZm9yZSBcIlNFQ1wiIGlzXG4jIG1lbnRpb25lZCB1bnRpbCB0d28gd29yZHMgYWZ0ZXIgdGhlIFNFQyBpcyBtZW50aW9uZWRcblNFQ19yZWZlcmVuY2VzIDwtIHN0cl9leHRyYWN0X2FsbChkb2MsIFwiWzpncmFwaDpdK1xcXFxzWzpncmFwaDpdK1xcXFxzU0VDXFxcXHNbOmdyYXBoOl0rK1xcXFxzWzpncmFwaDpdK1wiKVxuXG4jIHByaW50IG91dCB0aGUgcmVzdWx0c1xucGFnZV9yZWZlcmVuY2VzXG5TRUNfbG9jYXRpb25cblxuI0VORCIsInNjdCI6IiMgVGVtcGxhdGUgYmFzZWQgb24gaHR0cHM6Ly93d3cucmRvY3VtZW50YXRpb24ub3JnL3BhY2thZ2VzL3Rlc3R3aGF0L3ZlcnNpb25zLzQuMS4xXG4jIENoZWNrIGlmIHNvbWV0aGluZyBpcyBleHBsaWNpdGx5IHR5cGVkXG5cbnRlc3RfZXhwcmVzc2lvbl9vdXRw
dXQoXCJsZW5ndGgocGFnZV9yZWZlcmVuY2VzW1sxXV0pXCIsIGluY29ycmVjdF9tc2c9XCJgcGFnZV9yZWZlcmVuY2VzYCBzaG91bGQgZW5kIHVwIHdpdGggMiBtYXRjaGVzLlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcInBhZ2VfcmVmZXJlbmNlc1wiLCBpbmNvcnJlY3RfbXNnPVwiSXQgYXBwZWFycyB5b3VyIGBTRUNfbWVudGlvbnNgIGlzbid0IHF1aXRlIHJpZ2h0LCBidXQgaXQgaGFzIHRoZSByaWdodCBudW1iZXIgb2YgbWF0Y2hlcy5cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJsZW5ndGgoU0VDX3JlZmVyZW5jZXNbWzFdXSlcIiwgaW5jb3JyZWN0X21zZz1cImBTRUNfcmVmZXJlbmNlc2Agc2hvdWxkIGVuZCB1cCB3aXRoIDIgbWF0Y2hlcy5cIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJTRUNfcmVmZXJlbmNlc1wiLCBpbmNvcnJlY3RfbXNnPVwiSXQgYXBwZWFycyB5b3VyIGBTRUNfbG9jYXRpb25gIGlzbid0IHF1aXRlIHJpZ2h0LCBidXQgaXQgaGFzIHRoZSByaWdodCBudW1iZXIgb2YgbWF0Y2hlcy5cIilcblxuIyB0ZXN0X3N0dWRlbnRfdHlwZWQoJ3ggPC0gMicsIG5vdF90eXBlZF9tc2c9JycpXG5cbiMgQ2hlY2sgaWYgZnVuY3Rpb24gd2FzIHVzZWQgaW4gaW5wdXQgY29kZVxuIyB0ZXN0X2Z1bmN0aW9uKCdjJyxpbmNvcnJlY3RfbXNnPScnKSAgXG5cbiMgUmVxdWlyZXMgYW4gb2JqZWN0IGB4YCB0byBoYXZlIHRoZSBzYW1lIHZhbHVlIGFzIHRoZSBzb2x1dGlvblxuIyB0ZXN0X29iamVjdChcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIix1bmRlZmluZWRfbXNnID0gXCJcIikgIFxuXG4jIFJlcXVpcmVzIGFuIG9uamVjdCB3aXRoIHRoZSBzYW1lIHZhbHVlIG9mIGB4YCBpbiB0aGUgc29sdXRpb25cbiMgdGVzdF9hbl9vYmplY3QoXCJ4XCIsdW5kZWZpbmVkX21zZz1cIlwiKVxuXG4jIENoZWNrcyBpZiBvdXRwdXQgb2Ygc3R1ZGVudCdzIGNvZGUgY29udGFpbnMgZ2l2ZW4gZXZhbHVhdGVkIGV4cHJlc3Npb25cbiMgdGVzdF9vdXRwdXRfY29udGFpbnMoXCJ4XCIsaW5jb3JyZWN0X21zZyA9IFwiXCIpXG5cbiMgQ2hlY2sgaWYgYSB2ZWN0b3Igb2YgcHJlZGVmaW5lZCBvYmplY3RzIGFyZSB1bmNoYW5nZWRcbiMgdGVzdF9wcmVkZWZpbmVkX29iamVjdHMoYygneCcsJ3knKSxpbmNvcnJlY3RfbXNnPVwiRG9uJ3Qgb252ZXJ3cml0ZSB0aGUgcHJlZGVmaW5lZCB2YXJpYWJsZXNcIilcblxuIyBDaGVja3MgZm9yIGEgcmVnZXggcGF0dGVybiBpbiB0cmhlIG91dHB1dFxuIyB0ZXN0X291dHB1dF9yZWdleChwYXR0ZXJuLGZpeGVkPUYsIHRpbWVzPTEsIGluY29ycmVjdF9tc2c9JycpXG5cbiMgQ2FuIGNoZWNrIGFuIGFyYml0cmFyeSBleHByZXNzaW9uIGFjcm9zcyBib3RoIHNvbHV0aW9uIGFuZCBzdHVkZW50IGNvZGVcbiN0ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwidHlwZW9mKGNvbXBhbnlfbmFtZSlcIiwgaW5jb3JyZWN0X21zZz1cIkRpZCB5b3Ugc3RvcmUgdGV4dHVhbCBkYXRhIGluIGBjb21wYW55X25h
bWVgP1wiKVxuXG50ZXN0X2Vycm9yKClcbnN1Y2Nlc3NfbXNnKFwiQXdlc29tZSFcIilcblxuIyBPdGhlciBmdW5jdGlvbnMgdG8gbm90ZTpcbiMgICAgIC0gdGVzdF9vcihhLGIpIC0tIGNoZWNrcyBpZiBlaXRoZXIgdGVzdCBhIG9yIHRlc3QgYiBwYXNzXG4jICAgICAtIHRlc3RfZ2dwbG90KCkgLS0gY2FuIGNoZWNrIGlmIHBsb3RzIGFyZSBjb3JyZWN0XG4jICAgICAtIHRlc3RfZnVuY3Rpb24oKSAtLSBjYW4gYWxzbyBjaGVjayBpbmNsdWRlZCBwYXJhbWV0ZXJzXG4jICAgICAtIHRlc3RfbG9vcCgpIC0tIGNoZWNraW5nIGZvciBhbmQgd2hpbGUgbG9vcHNcbiMgICAgIC0gdGVzdF9saWJyYXJ5X2Z1bmN0aW9uKCdwYWNrYWdlJywgbm90X2NhbGxlZF9tc2c9JycsaW5jb3JyZWN0X21zZz0nJylcbiMgICAgIC0gdGVzdF9pZl9lbHNlKCkgLS0gY2hlY2tpbmcgaWYgc3RhdGVtZW50c1xuIyAgICAgLSB0ZXN0X2V4cHJlc3Npb25fZXJyb3IoKSAtLSBjYW4gY2hlY2sgaWYgZnVuY3Rpb25zIGFyZSBwcm9wZXJseSBkZWZpbmVkXG4jICAgICAtIHRlc3Rfb3BlcmF0b3IoJ29wZXJhdG9yJywpLCBub3RfY2FsbGVkX21zZz0nJyxpbmNvcnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2Z1bmN0aW9uX2RlZmluaXRpb24oKSAtLSByaWdvcm91c2x5IGNoZWNrIGRlZmluZWQgZnVuY3Rpb25cbiMgICAgIC0gdGVzdF9kYXRhX2ZyYW1lKCkgLS0gY2hlY2sgaWYgZGF0YWZyYW1lIFtjb2x1bW5zXSBhcmUgZXF1aXZhbGVudFxuIyAgICAgLSB0ZXN0X2Z1bmN0aW9uX3Jlc3VsdCwgdGVzdF9leHByZXNzaW9uX3Jlc3VsdCJ9
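
The two ideas in this exercise, an optional character and a keyword matched together with its surrounding words, can be sketched on made-up sentences. The strings and variable names are assumptions for illustration, and the second pattern uses \\S+ (a run of non-whitespace characters) as one possible way to stand in for "a word".

```r
library(stringr)

# Toy sentences (hypothetical, not the exercise text)
refs <- "See Risk factors on page 12 and Capital on pages 45 for details."
news <- "In June the SEC approved reforms, and later the SEC adopted them."

# "s?" makes the plural optional; [[:digit:]]+ matches one or more digits
page_refs <- str_extract_all(refs, "pages?\\s[[:digit:]]+")
page_refs[[1]]     # "page 12"  "pages 45"

# Two whitespace-separated tokens on each side of the keyword.
# Punctuation attached to a token is kept, because \\S matches it.
sec_context <- str_extract_all(news, "\\S+\\s\\S+\\sSEC\\s\\S+\\s\\S+")
sec_context[[1]]   # "June the SEC approved reforms,"  "later the SEC adopted them."
```

A keyword that sits within two tokens of the start or end of the string would not match this second pattern, since it requires two tokens on both sides.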

Exercise 5: Further regex practice

eyJsYW5ndWFnZSI6InIiLCJwcmVfZXhlcmNpc2VfY29kZSI6ImxpYnJhcnkoZHBseXIpXG5saWJyYXJ5KHN0cmluZ3IpXG5kb2MgPC0gXCJPbiBKdW5lIDUsIDIwMTMsIHRoZSBTRUMgYXBwcm92ZWQgdGhlIHB1YmxpY2F0aW9uIG9mIHByb3Bvc2VkIHN0cnVjdHVyYWwgcmVmb3JtcyBvZiBtb25leSBtYXJrZXQgZnVuZHMuIFRoZSBwcm9wb3NhbCBjb25zaWRlcmVkIHR3byByZWZvcm0gYWx0ZXJuYXRpdmVzIHRoYXQgY291bGQgYmUgYWRvcHRlZCBlaXRoZXIgYWxvbmUgb3IgaW4gY29tYmluYXRpb246IChpKSByZXF1aXJpbmcgcHJpbWUgYW5kIHRheC1leGVtcHQgaW5zdGl0dXRpb25hbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gZmxvYXQgdGhlaXIgbmV0IGFzc2V0IHZhbHVlcyBvciAoaWkpIHJlcXVpcmluZyBhbGwgbm9uLWdvdmVybm1lbnRhbCBtb25leSBtYXJrZXQgZnVuZHMgdG8gaW1wb3NlIGxpcXVpZGl0eSBmZWVzIG9mIHVwIHRvIDIlIGFuZCB0byBoYXZlIHRoZSBvcHRpb24gdG8gdGVtcG9yYXJpbHkgc3VzcGVuZCByZWRlbXB0aW9ucyAob3IgZ2F0ZSB0aGUgbW9uZXkgbWFya2V0IGZ1bmQpIHVwb24gdGhlIG9jY3VycmVuY2Ugb2Ygc3BlY2lmaWVkIGV2ZW50cyBpbmRpY2F0aW5nIHRoYXQgdGhlIGZ1bmQgbWF5IGJlIHVuZGVyIHN0cmVzcy4gSXQgaXMgY3VycmVudGx5IGFudGljaXBhdGVkIHRoYXQgdGhlIFNFQyB3aWxsIGFkb3B0IGZpbmFsIHN0cnVjdHVyYWwgcmVmb3JtcyBpbiAyMDE0LiBUaGUgRmluYW5jaWFsIFN0YWJpbGl0eSBCb2FyZCAodGhlIEZTQikgaGFzIGVuZG9yc2VkIGFuZCBwdWJsaXNoZWQgZm9yIHB1YmxpYyBjb25zdWx0YXRpb24gMTUgcG9saWN5IHJlY29tbWVuZGF0aW9ucyBwcm9wb3NlZCBieSB0aGUgSW50ZXJuYXRpb25hbCBPcmdhbml6YXRpb24gb2YgU2VjdXJpdGllcyBDb21taXNzaW9ucywgaW5jbHVkaW5nIHJlcXVpcmluZyBtb25leSBtYXJrZXQgZnVuZHMgdG8gYWRvcHQgYSBmbG9hdGluZyBuZXQgYXNzZXQgdmFsdWUuIEluIGFkZGl0aW9uLCBpbiBTZXB0ZW1iZXIgMjAxMyB0aGUgRXVyb3BlYW4gQ29tbWlzc2lvbiAodGhlIEVDKSByZWxlYXNlZCBhIHByb3Bvc2FsIGZvciBhIG5ldyByZWd1bGF0aW9uIG9uIG1vbmV5IG1hcmtldCBmdW5kcyBpbiB0aGUgRVUuIFRoZSBFQyBwcm9wb3NlZCB0d28gb3B0aW9ucyBmb3Igc3RhYmxlIG5ldCBhc3NldCB2YWx1ZSBtb25leSBtYXJrZXQgZnVuZHM6IGVpdGhlciAoaSkgbWFpbnRhaW4gYSBjYXBpdGFsIGJ1ZmZlciBvZiBhdCBsZWFzdCB0aHJlZSBwZXJjZW50IG9mIGFzc2V0cyB1bmRlciBtYW5hZ2VtZW50IG9yIChpaSkgZmxvYXQgdGhlIG5ldCBhc3NldCB2YWx1ZSBvZiB0aGUgbW9uZXkgbWFya2V0IGZ1bmQuIFRoZSBFQyBwcm9wb3NhbCBpcyBjdXJyZW50bHkgYmVpbmcgcmV2aWV3ZWQgYnkgdGhlIEV1cm9wZWFuIFBhcmxpYW1lbnQgYW5kIHRoZSBDb3VuY2lsIG9mIE1lbWJlciBTdGF0ZXMgYXMgY28tbGVnaXNsYXRvcnMsIGFuZCBpcyBl
eHBlY3RlZCB0byBiZSBhcHByb3ZlZCBpbiAyMDE0LiBGb3IgZnVydGhlciBpbmZvcm1hdGlvbiBvbiBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMsIHNlZSBTaWduaWZpY2FudCBpbnRlcm5hdGlvbmFsIHJlZ3VsYXRvcnkgaW5pdGlhdGl2ZXMgb24gcGFnZXMgODkgLiBiYW5rIHN1YnNpZGlhcmllcyBwYXkgYW5udWFsbHkgdG8gdGhlIEZESUMuIEZvciBtb3JlIGluZm9ybWF0aW9uLCBzZWUgRGVwb3NpdCBpbnN1cmFuY2Ugb24gcGFnZSA2IC4gXCJcbmZpcnN0IDwtIHN0cl9zdWIoZG9jLCAxLCAxMDApIiwic2FtcGxlIjoiIyBUaGUgdmFyaWFibGUgZG9jIGNvbnRhaW5zIGEgcGFyYWdyYXBoIGZyb20gSlBNJ3MgMjAxNCBhbm51YWwgcmVwb3J0XG5jYXQoZG9jKVxuXG4jIEZpbmQgYWxsIHdvcmRzIHRoYXQgaW5jbHVkZSBhIGRvdWJsZWQgbGV0dGVyLCBsaWtlIFwic3NcIiBvciBcIm9vXCJcbiMgXCJwYWdlcyAjXCIuXG5yZWdleDEgPC0gc3RyX2V4dHJhY3RfYWxsKClcblxuIyBGaW5kIGFsbCBudW1iZXJzIChqdXN0IHRoZSBudW1iZXIpXG5yZWdleDIgPC0gc3RyX2V4dHJhY3RfYWxsKClcblxuIyBGaW5kIGFsbCBwaHJhc2VzIGluIHBhcmVudGhlc2VzLCBpbmNsdWRpbmcgdGhlIHBhcmVudGhlc2VzIHRoZW1zZWx2ZXNcbnJlZ2V4MyA8LSBzdHJfZXh0cmFjdF9hbGwoKVxuXG4jIHByaW50IG91dCB0aGUgcmVzdWx0c1xucmVnZXgxXG5yZWdleDJcbnJlZ2V4M1xuXG4jRU5EIiwic29sdXRpb24iOiIjIFRoZSB2YXJpYWJsZSBkb2MgY29udGFpbnMgYSBwYXJhZ3JhcGggZnJvbSBKUE0ncyAyMDE0IGFubnVhbCByZXBvcnRcbmNhdChkb2MpXG5cbiMgRmluZCBhbGwgd29yZHMgdGhhdCBpbmNsdWRlIGEgZG91YmxlZCBsZXR0ZXIsIGxpa2UgXCJzc1wiIG9yIFwib29cIlxuIyBcInBhZ2VzICNcIi5cbnJlZ2V4MSA8LSBzdHJfZXh0cmFjdF9hbGwoZG9jLCBcIls6YWxwaGE6XSooWzphbHBoYTpdKVxcXFwxWzphbHBoYTpdKlwiKVxuXG4jIEZpbmQgYWxsIG51bWJlcnMgKGp1c3QgdGhlIG51bWJlcilcbnJlZ2V4MiA8LSBzdHJfZXh0cmFjdF9hbGwoZG9jLCBcIls6ZGlnaXQ6XStcIilcblxuIyBGaW5kIGFsbCBwaHJhc2VzIGluIHBhcmVudGhlc2VzLCBpbmNsdWRpbmcgdGhlIHBhcmVudGhlc2VzIHRoZW1zZWx2ZXNcbnJlZ2V4MyA8LSBzdHJfZXh0cmFjdF9hbGwoZG9jLCBcIlxcXFwoLio/XFxcXClcIilcblxuIyBwcmludCBvdXQgdGhlIHJlc3VsdHNcbnJlZ2V4MVxucmVnZXgyXG5yZWdleDNcblxuI0VORCIsInNjdCI6IiMgVGVtcGxhdGUgYmFzZWQgb24gaHR0cHM6Ly93d3cucmRvY3VtZW50YXRpb24ub3JnL3BhY2thZ2VzL3Rlc3R3aGF0L3ZlcnNpb25zLzQuMS4xXG4jIENoZWNrIGlmIHNvbWV0aGluZyBpcyBleHBsaWNpdGx5IHR5cGVkXG5cbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJsZW5ndGgocmVnZXgxW1sxXV0pXCIsIGluY29ycmVjdF9tc2c9XCJgcmVnZXgxYCBoYXMgYW4gaW5jb3JyZWN0IG51bWJl
ciBvZiBtYXRjaGVzXCIpXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwicmVnZXgxXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYHJlZ2V4MWAgaXNuJ3QgcXVpdGUgcmlnaHQsIGJ1dCBpdCBoYXMgdGhlIHJpZ2h0IG51bWJlciBvZiBtYXRjaGVzLlwiKVxudGVzdF9leHByZXNzaW9uX291dHB1dChcImxlbmd0aChyZWdleDJbWzFdXSlcIiwgaW5jb3JyZWN0X21zZz1cImByZWdleDJgIGhhcyBhbiBpbmNvcnJlY3QgbnVtYmVyIG9mIG1hdGNoZXNcIilcbnRlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJyZWdleDJcIiwgaW5jb3JyZWN0X21zZz1cIkl0IGFwcGVhcnMgeW91ciBgcmVnZXgyYCBpc24ndCBxdWl0ZSByaWdodCwgYnV0IGl0IGhhcyB0aGUgcmlnaHQgbnVtYmVyIG9mIG1hdGNoZXMuXCIpXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwibGVuZ3RoKHJlZ2V4M1tbMV1dKVwiLCBpbmNvcnJlY3RfbXNnPVwiYHJlZ2V4M2AgaGFzIGFuIGluY29ycmVjdCBudW1iZXIgb2YgbWF0Y2hlcy4gIERpZCB5b3UgbWFrZSBzdXJlIHRvIGtlZXAgdGhlIHRleHQgc2hvcnQ/XCIpXG50ZXN0X2V4cHJlc3Npb25fb3V0cHV0KFwicmVnZXgzXCIsIGluY29ycmVjdF9tc2c9XCJJdCBhcHBlYXJzIHlvdXIgYHJlZ2V4M2AgaXNuJ3QgcXVpdGUgcmlnaHQsIGJ1dCBpdCBoYXMgdGhlIHJpZ2h0IG51bWJlciBvZiBtYXRjaGVzLlwiKVxuXG4jIHRlc3Rfc3R1ZGVudF90eXBlZCgneCA8LSAyJywgbm90X3R5cGVkX21zZz0nJylcblxuIyBDaGVjayBpZiBmdW5jdGlvbiB3YXMgdXNlZCBpbiBpbnB1dCBjb2RlXG4jIHRlc3RfZnVuY3Rpb24oJ2MnLGluY29ycmVjdF9tc2c9JycpICBcblxuIyBSZXF1aXJlcyBhbiBvYmplY3QgYHhgIHRvIGhhdmUgdGhlIHNhbWUgdmFsdWUgYXMgdGhlIHNvbHV0aW9uXG4jIHRlc3Rfb2JqZWN0KFwieFwiLGluY29ycmVjdF9tc2cgPSBcIlwiLHVuZGVmaW5lZF9tc2cgPSBcIlwiKSAgXG5cbiMgUmVxdWlyZXMgYW4gb25qZWN0IHdpdGggdGhlIHNhbWUgdmFsdWUgb2YgYHhgIGluIHRoZSBzb2x1dGlvblxuIyB0ZXN0X2FuX29iamVjdChcInhcIix1bmRlZmluZWRfbXNnPVwiXCIpXG5cbiMgQ2hlY2tzIGlmIG91dHB1dCBvZiBzdHVkZW50J3MgY29kZSBjb250YWlucyBnaXZlbiBldmFsdWF0ZWQgZXhwcmVzc2lvblxuIyB0ZXN0X291dHB1dF9jb250YWlucyhcInhcIixpbmNvcnJlY3RfbXNnID0gXCJcIilcblxuIyBDaGVjayBpZiBhIHZlY3RvciBvZiBwcmVkZWZpbmVkIG9iamVjdHMgYXJlIHVuY2hhbmdlZFxuIyB0ZXN0X3ByZWRlZmluZWRfb2JqZWN0cyhjKCd4JywneScpLGluY29ycmVjdF9tc2c9XCJEb24ndCBvbnZlcndyaXRlIHRoZSBwcmVkZWZpbmVkIHZhcmlhYmxlc1wiKVxuXG4jIENoZWNrcyBmb3IgYSByZWdleCBwYXR0ZXJuIGluIHRyaGUgb3V0cHV0XG4jIHRlc3Rfb3V0cHV0X3JlZ2V4KHBhdHRlcm4sZml4ZWQ9RiwgdGltZXM9MSwgaW5jb3JyZWN0X21zZz0nJylcblxuIyBDYW4gY2hlY2sgYW4g
YXJiaXRyYXJ5IGV4cHJlc3Npb24gYWNyb3NzIGJvdGggc29sdXRpb24gYW5kIHN0dWRlbnQgY29kZVxuI3Rlc3RfZXhwcmVzc2lvbl9vdXRwdXQoXCJ0eXBlb2YoY29tcGFueV9uYW1lKVwiLCBpbmNvcnJlY3RfbXNnPVwiRGlkIHlvdSBzdG9yZSB0ZXh0dWFsIGRhdGEgaW4gYGNvbXBhbnlfbmFtZWA/XCIpXG5cbnRlc3RfZXJyb3IoKVxuc3VjY2Vzc19tc2coXCJBd2Vzb21lIVwiKVxuXG4jIE90aGVyIGZ1bmN0aW9ucyB0byBub3RlOlxuIyAgICAgLSB0ZXN0X29yKGEsYikgLS0gY2hlY2tzIGlmIGVpdGhlciB0ZXN0IGEgb3IgdGVzdCBiIHBhc3NcbiMgICAgIC0gdGVzdF9nZ3Bsb3QoKSAtLSBjYW4gY2hlY2sgaWYgcGxvdHMgYXJlIGNvcnJlY3RcbiMgICAgIC0gdGVzdF9mdW5jdGlvbigpIC0tIGNhbiBhbHNvIGNoZWNrIGluY2x1ZGVkIHBhcmFtZXRlcnNcbiMgICAgIC0gdGVzdF9sb29wKCkgLS0gY2hlY2tpbmcgZm9yIGFuZCB3aGlsZSBsb29wc1xuIyAgICAgLSB0ZXN0X2xpYnJhcnlfZnVuY3Rpb24oJ3BhY2thZ2UnLCBub3RfY2FsbGVkX21zZz0nJyxpbmNvcnJlY3RfbXNnPScnKVxuIyAgICAgLSB0ZXN0X2lmX2Vsc2UoKSAtLSBjaGVja2luZyBpZiBzdGF0ZW1lbnRzXG4jICAgICAtIHRlc3RfZXhwcmVzc2lvbl9lcnJvcigpIC0tIGNhbiBjaGVjayBpZiBmdW5jdGlvbnMgYXJlIHByb3Blcmx5IGRlZmluZWRcbiMgICAgIC0gdGVzdF9vcGVyYXRvcignb3BlcmF0b3InLCksIG5vdF9jYWxsZWRfbXNnPScnLGluY29ycmVjdF9tc2c9JycpXG4jICAgICAtIHRlc3RfZnVuY3Rpb25fZGVmaW5pdGlvbigpIC0tIHJpZ29yb3VzbHkgY2hlY2sgZGVmaW5lZCBmdW5jdGlvblxuIyAgICAgLSB0ZXN0X2RhdGFfZnJhbWUoKSAtLSBjaGVjayBpZiBkYXRhZnJhbWUgW2NvbHVtbnNdIGFyZSBlcXVpdmFsZW50XG4jICAgICAtIHRlc3RfZnVuY3Rpb25fcmVzdWx0LCB0ZXN0X2V4cHJlc3Npb25fcmVzdWx0In0=

Calculating quantities with text

Note: DataCamp Light does not include the quanteda and tidytext packages, so the exercises below provide sample code to run on your own computer in RStudio. If you don't have these packages, install them first by running install.packages("quanteda") and install.packages("tidytext").
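As a convenience, the check for missing packages can be scripted. The following is a minimal base-R sketch; the helper name missing_packages is hypothetical and not part of the course materials, and the code only reports what is missing rather than installing anything.

```r
# Packages that the sample code in this section relies on
needed <- c("quanteda", "tidytext")

# Hypothetical helper: return the subset of `pkgs` that is not installed.
# requireNamespace() returns FALSE (quietly) when a package is unavailable.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(needed)
if (length(to_install) > 0) {
  # Print the install command instead of running it automatically
  message("Run: install.packages(c(",
          paste0('"', to_install, '"', collapse = ", "), "))")
} else {
  message("All required packages are installed.")
}
```

Printing the command rather than calling install.packages() directly keeps the script safe to run on machines without network access.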

Each of the three exercises below can be run as a standalone script, since each code block loads all the packages it needs.

Code for this section can be downloaded as an Rmd file here. The output of this code can be viewed in this Rmarkdown notebook.

Exercise 6: Readability with Quanteda

How does the readability of JPMorgan’s annual report compare to the Citigroup annual report from class?

    # Load readr (or tidyverse) to get the read_file() function
    library(readr)

    # Load all of JPM's 2014 annual report
    doc <- read_file("https://rmc.link/Slides/acct420/Session_8/0000019617-14-000289.txt")

    # Load quanteda
    library(quanteda)
    # Note: in quanteda 3.0 and later, textstat_readability() is provided by the
    # separate quanteda.textstats package; load it as well if the function is not found.

    # Calculate the three readability measures
    textstat_readability(doc, "Flesch.Kincaid")
    textstat_readability(doc, "FOG")
    textstat_readability(doc, "Coleman.Liau")
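To make concrete what a readability measure computes, here is a minimal base-R sketch of the Flesch-Kincaid grade level, one commonly used measure. It is an illustration only: the syllable counter is a crude vowel-group heuristic and the two sample sentences are made up, so scores will not match what quanteda reports for the same text.

```r
# Rough syllable count: number of vowel groups per word, with a minimum of 1.
# This is a crude heuristic, not the method quanteda uses.
count_syllables <- function(words) {
  matches <- gregexpr("[aeiouy]+", tolower(words))
  counts <- vapply(matches, function(m) sum(m > 0), integer(1))
  pmax(counts, 1L)
}

# Flesch-Kincaid grade level:
#   0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
flesch_kincaid_grade <- function(text) {
  # Split into sentences on ., ! or ? and keep pieces that contain letters
  sentences <- unlist(strsplit(text, "[.!?]+"))
  sentences <- sentences[grepl("[[:alpha:]]", sentences)]
  # Words are runs of letters and apostrophes
  words <- unlist(regmatches(text, gregexpr("[[:alpha:]']+", text)))
  n_sent <- length(sentences)
  n_words <- length(words)
  n_syll <- sum(count_syllables(words))
  0.39 * (n_words / n_sent) + 11.8 * (n_syll / n_words) - 15.59
}

# Made-up sample texts for illustration
simple <- "The cat sat on the mat. The dog ran to the park."
dense <- "Regulatory authorities anticipated substantial institutional liquidity deterioration following unprecedented redemptions."

flesch_kincaid_grade(simple)
flesch_kincaid_grade(dense)
```

Longer sentences and longer words both push the grade level up, which is why dense regulatory language scores as much harder to read than short everyday sentences.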

Exercise 7: Sentiment with tidytext

How does the sentiment of JPMorgan’s annual report compare to the Citigroup annual report from class?

    # Load readr (or tidyverse) to get the read_file() function
    library(readr)

    # Load all of JPM's 2014 annual report
    doc <- read_file("https://rmc.link/Slides/acct420/Session_8/0000019617-14-000289.txt")

    # Load tidytext
    library(tidytext)

    # Load some components of tidyverse
    library(dplyr)  # for the usual commands
    library(tidyr)  # for spread

    # Convert the document to tidy format (one row per word)
    df_doc <- data.frame(ID=c("0000019617-14-000289"), text=c(doc),
                         stringsAsFactors = FALSE) %>%
      unnest_tokens(word, text)

    # Calculate term frequency
    terms <- df_doc %>%
      count(ID, word, sort=TRUE) %>%
      ungroup()
    total_terms <- terms %>%
      group_by(ID) %>%
      summarize(total = sum(n))
    tf <- left_join(terms, total_terms, by = "ID") %>% mutate(tf=n/total)

    # Get the Loughran McDonald sentiment dictionary
    sentiment <- get_sentiments("loughran")

    # Merge in sentiment
    tf_sent <- tf %>% left_join(sentiment, by = "word")

    # Sum the term frequencies within each sentiment category
    tf_sent %>%
      spread(sentiment, tf, fill=0) %>%
      select(constraining, litigious, negative, positive, superfluous, uncertainty) %>%
      colSums()
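The idea behind dictionary-based sentiment can be illustrated without any packages. The sketch below uses base R, a made-up sentence, and a tiny made-up dictionary (not the Loughran McDonald word lists) to show how term frequencies are summed by sentiment category.

```r
# Toy text and a tiny, made-up sentiment dictionary (for illustration only)
text <- "Losses increased and risk remains uncertain, but profit and growth improved."
dictionary <- data.frame(
  word = c("losses", "risk", "uncertain", "profit", "growth", "improved"),
  sentiment = c("negative", "negative", "uncertainty",
                "positive", "positive", "positive"),
  stringsAsFactors = FALSE
)

# Tokenize: lowercase the text, then keep runs of letters
lower <- tolower(text)
words <- unlist(regmatches(lower, gregexpr("[a-z]+", lower)))

# Term frequency: each word's count divided by the total number of words
counts <- as.data.frame(table(word = words), stringsAsFactors = FALSE)
counts$tf <- counts$Freq / sum(counts$Freq)

# Attach sentiment labels; words outside the dictionary get NA
tf_sent <- merge(counts, dictionary, by = "word", all.x = TRUE)

# Share of all words that fall in each sentiment category
# (rows with NA sentiment are dropped by tapply)
sentiment_share <- tapply(tf_sent$tf, tf_sent$sentiment, sum)
sentiment_share
```

Because each share is a fraction of all words in the document, the results are comparable across documents of different lengths, which is what makes a comparison between two annual reports meaningful.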

Exercise 8: Make a word cloud after removing stopwords

    # Load readr (or tidyverse) to get the read_file() function
    library(readr)

    # Load all of JPM's 2014 annual report
    doc <- read_file("https://rmc.link/Slides/acct420/Session_8/0000019617-14-000289.txt")

    # Load quanteda and tidytext
    library(quanteda)
    library(tidytext)
    # Note: in quanteda 3.0 and later, textplot_wordcloud() is provided by the
    # separate quanteda.textplots package; load it as well if the function is not found.

    # Load some of tidyverse
    library(dplyr)

    # Convert the document to tidy format (one row per word)
    df_doc <- data.frame(ID=c("0000019617-14-000289"), text=c(doc),
                         stringsAsFactors = FALSE) %>%
      unnest_tokens(word, text)

    # Pull a list of stopwords
    smart_stopwords <- stopwords(source="smart")

    # Remove stopwords
    df_doc_stop <- df_doc %>%
      anti_join(data.frame(word=smart_stopwords, stringsAsFactors=FALSE), by = "word")

    # Build a corpus object for quanteda
    corp <- corpus(df_doc_stop, docid_field="ID", text_field="word")

    # Plot a word cloud. If you don't have RColorBrewer installed, you can
    # remove the `color=` option. The corpus is tokenized explicitly because
    # recent quanteda versions expect dfm() to receive tokens rather than a corpus.
    textplot_wordcloud(dfm(tokens(corp)), color = RColorBrewer::brewer.pal(10, "RdBu"))
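A word cloud sizes each word by how often it appears, so the stopword step determines what ends up dominating the picture. The following base-R sketch, with a made-up sentence and a tiny illustrative stopword list (an assumption; real lists such as SMART are far longer), shows the filtering and counting that feed a word cloud.

```r
# Made-up sample text
text <- "The bank and the fund said that the fund is stable and the bank is sound."

# Tiny illustrative stopword list (hypothetical, for demonstration only)
stop_list <- c("the", "and", "that", "is")

# Tokenize: lowercase the text, then keep runs of letters
lower <- tolower(text)
words <- unlist(regmatches(lower, gregexpr("[a-z]+", lower)))

# Drop stopwords, then count what remains, most frequent first
kept <- words[!(words %in% stop_list)]
freq <- sort(table(kept), decreasing = TRUE)
freq
```

Without the filter, "the" would be the most frequent word in this sentence; after filtering, the content words "bank" and "fund" lead the counts.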