[Withheld from all public copies]
What could go wrong if the Uber data wasn’t anonymized?
Both Allman & Paxson, and Partridge warn against relying on the anonymisation of data since deanonymisation techniques are often surprisingly powerful. Robust anonymisation of data is difficult, particularly when it has high dimensionality, as the anonymisation is likely to lead to an unacceptable level of data loss [3]. – TPHCB 2017
Also, note the existence of the Personal Data Protection Act (PDPA) in Singapore
What risks does this pose? Consider contexts outside Singapore as well.
By iterating repeatedly, the generative network can find a strategy that can generally circumvent the discriminative network
“The collection, or use, of a dataset of illicit origin to support research can be advantageous. For example, legitimate access to data may not be possible, or the reuse of data of illicit origin is likely to require fewer resources than collecting data again from scratch. In addition, the sharing and reuse of existing datasets aids reproducibility, an important scientific goal. The disadvantage is that ethical and legal questions may arise as a result of the use of such data” (source)
For experiments, see The Belmont Report; for electronic data, see The Menlo Report
Older methods
words | f_animal | f_people | f_location |
---|---|---|---|
dog | 0.5 | 0.3 | -0.3 |
cat | 0.5 | 0.1 | -0.3 |
Bill | 0.1 | 0.9 | -0.4 |
turkey | 0.5 | -0.2 | -0.3 |
Turkey | -0.5 | 0.1 | 0.7 |
Singapore | -0.5 | 0.1 | 0.8 |
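
One way to read the table is to compare rows: words with similar meanings should have vectors pointing in similar directions. The sketch below copies the toy values from the table and compares them with cosine similarity. The similarity measure and the helper names `cosine` and `most_similar` are assumptions made for illustration; they are not taken from the slides.

```python
import math

# Toy vectors copied from the table: (f_animal, f_people, f_location).
vectors = {
    "dog": (0.5, 0.3, -0.3),
    "cat": (0.5, 0.1, -0.3),
    "Bill": (0.1, 0.9, -0.4),
    "turkey": (0.5, -0.2, -0.3),
    "Turkey": (-0.5, 0.1, 0.7),
    "Singapore": (-0.5, 0.1, 0.8),
}


def cosine(u, v):
    """Cosine similarity: 1 = same direction, -1 = opposite direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word):
    """Hypothetical helper: the other word whose vector is closest in direction."""
    others = (w for w in vectors if w != word)
    return max(others, key=lambda w: cosine(vectors[word], vectors[w]))


for a, b in [("dog", "cat"), ("Turkey", "Singapore"), ("turkey", "Turkey")]:
    print(f"{a} vs {b}: {cosine(vectors[a], vectors[b]):.2f}")

print("closest to dog:", most_similar("dog"))
```

With these toy numbers, the two place names score close to 1 against each other, while lowercase turkey (the bird) and Turkey (the country) come out negative, even though the two words differ only in capitalization.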
Infer a word’s meaning from the words around it
Referred to as CBOW (continuous bag of words)
Infer a word’s meaning by generating words around it
Referred to as the Skip-gram model
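
The difference between the two approaches is easiest to see in the training examples each one builds from a sentence. The following is a minimal sketch, not the actual word2vec implementation: the function name `training_pairs`, the `window` parameter, and the example sentence are all hypothetical, and no model is trained here.

```python
def training_pairs(tokens, window=2):
    """Build CBOW and skip-gram training examples from one tokenized sentence."""
    cbow, skipgram = [], []
    for i, target in enumerate(tokens):
        lo = max(0, i - window)
        # Words within `window` positions on either side of the center word.
        context = tokens[lo:i] + tokens[i + 1:i + 1 + window]
        # CBOW: the surrounding words jointly predict the center word.
        cbow.append((context, target))
        # Skip-gram: the center word predicts each surrounding word in turn.
        skipgram.extend((target, c) for c in context)
    return cbow, skipgram


sentence = "the dog chased the cat".split()
cbow, skipgram = training_pairs(sentence, window=1)

print(cbow[1])       # (context words, center word) for "dog"
print(skipgram[:3])  # first few (center word, context word) pairs
```

CBOW yields one example per position (many inputs, one output), whereas skip-gram yields one example per center-context pair, so the same sentence produces more, smaller examples under skip-gram.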
Source article (colah.github.io)
Samsung Electronics Co., suffering a handset sales slide, revealed a foldable-screen smartphone that folds like a book and opens up to tablet size. Ah, horror? I play Thee to her alone;
And when we have withdrom him, good all.
Come, go with no less through.
Enter Don Pedres. A flourish and my money. I will tarry. Well, you do!
LADY CAPULET.
Farewell; and you are
What creative uses for the techniques discussed today do you expect to see become reality in accounting in the next 3-5 years?
Today, we:
# Alternative seed texts: if all three lines are run, only the last assignment takes effect
seed_txt = 'Looks it not like the king? Verily, we must go! ' # Original code
seed_txt = 'SCENE I. Elsinore. A platform before the Castle.\n\n Enter Francisco and Barnardo, two sentinels.\n\nBARNARDO.\nWho’s there?\n\nFRANCISCO.\nNay, answer me. Stand and unfold yourself.\n\nBARNARDO.\nLong live the King!\n\n' # Hamlet
seed_txt = 'Samsung Electronics Co., suffering a handset sales slide, revealed a foldable-screen smartphone that folds like a book and opens up to tablet size.' # WSJ article
# From: https://www.wsj.com/articles/samsung-unveils-foldable-screen-smartphone-1541632221