• I can only say I’m touched to meet you in person.

  • I have fought many fights on recording.

  • A story almost nobody knows is how I got fired from teaching at Stanford. We talked about getting fired yesterday, when I was actually telling about my life, so I did not tell this story of how I got fired from Stanford.

  • I was teaching at Stanford, and I went for a weekend to a conference, TED Global -- not TED in Vancouver, but TED Global in Oxford. When I landed, my Stanford email was not working. I called my friend in the department. He said, "Oh," but he could not do anything about it. He talked to the department chair.

  • Then I used Gmail to email the department chair. He said, "You need to take down that video." I said, "What video?" Reid Hoffman, the founder of LinkedIn, and I, we talked after class for maybe 10 minutes. It’s a video of 10 minutes about identity, about data, of course, and so on.

  • Because I had not uploaded this video myself, I thought maybe the students had made a joke and put some porn in, or... Really, I said, "I’m sorry. I just landed in Oxford. I do not know what’s wrong with the video; just tell me." He said he had no time to watch the video. He had just been told I needed to take it down.

  • I skipped the first session of TED. I watched the video, and there was absolutely nothing wrong with it. It turned out that the university thought that by having the video up of Reid Hoffman -- who was a classmate of mine at Stanford; I’ve known him since the ’80s -- and me, talking about data and identity, it would cut into their revenues: people would not pay for my class but would just watch the class online. I had to take it down.

  • I said, "No, I’m not taking it down." Then he said, "If you don’t take it down, you will not teach here anymore." I said, "I’m not taking it down." Then, "OK, you’re not teaching anymore."

  • (laughter)

  • Ultimately, I really believe that we should stand behind the things we create, and we should not be willing to be suppressed by people who, as department chair, don’t even have 10 minutes to get to the truth of something. I’ve never set foot in that department again.

  • I feel very strongly about recording things. United Airlines once turned their airplane around and asked me to please leave because I was taking a video.

  • That led to very good relationships. I knew the previous CEO. I know most of the Star Alliance CEOs because of that event.

  • The notion of office hours was actually brought to me by the former United Airlines CEO, because he said, "If you come and do a workshop, great, let’s add a day where you just make one-hour time slots, and people can sign up to talk to you about whatever they want to talk about."

  • Long story short, for me, data -- recording things -- is really what gets us through truth to accountability. A lot of society, I personally think, whether in the US or, for sure, in mainland China, would be better off if we had that accountability.

  • I got a letter from Angela Merkel thanking me for the book. I thought, "Ah, I did not expect that." Then, when the German version was out, I got a call from the German Ministry of Defense asking me to go for lunch, and then they made me a two-star officer to help the German military in the cyber and information space, to help them understand the power of data.

  • I think I’ve touched a billion people, more than just as chief scientist at Amazon. These are very important organizations.

  • Angela Merkel in Germany is actually one of the most powerful leaders in the world. That’s why being here, I want to know how can our mindset of transparency help the world.

  • I’ll have to read your book. [laughs]

  • (laughter)

  • I can tell you what’s in it.

  • Thank you. It would not have happened without Olfa, not without Lisa, and not without Audrey.

  • How did this happen, anyway? You were visiting here, or...?

  • Olfa happened this way: three years ago, a student of mine at Berkeley, Andrew Hall, had the dean of NTU’s School of Engineering invite me for a lecture. I said that my currency -- how I get something out of it -- is to meet some students.

  • The day before, five or six students came to Montreal. One of them was Olfa. Then Olfa actually hosted me here onstage. Later, Olfa was in Silicon Valley and asked me whether he could come on a Silicon Valley data safari I was organizing for my students.

  • He said, "What can I do?" I said, "You can give me massage." [laughs] For many years, Olfa has come to my hotel room at 1:00 AM, for example, ready to give me massage, but all we did was talk. [laughs]

  • I came to Taiwan to get the massage from Olfa. To get the massage, you have to come to an island south of Kaohsiung. I said, "OK, I’ll go to an island south of Kaohsiung." That’s where we’re going tomorrow.

  • To see his good heart and his dedication to Taiwanese entrepreneurs, to really see the love that Olfa has for his country, and how much he gives -- like with the physics kid last night. We care. I did not know him before yesterday, but I care that he does well, that NTU student who is thinking about dropping out.

  • That’s why I’m here. I told Olfa I’m happy to spend two weeks in Taiwan to do whatever I can to help the country from a data perspective. I tell everybody Olfa is the boss. If he takes me somewhere, I go. That’s why I’m here -- long story, but true.

  • It’s a great story, mental massages.

  • (laughter)

  • Yeah, it’s mental massage.

  • The medium is the massage. No, the message is the massage. No, the medium is the message. No. The Canadian communication theorist whom I quote at the beginning of the book, Marshall McLuhan, said that the more the data banks, as he called them, know about us, the less we exist.

  • I think the more the databases know about us -- and he did not have a recorder like this one -- the more we exist, because we are the data we create.

  • Not for all the data.

  • Now, I am doing something with German television. It is trying to understand what the fears are that people have about recordings.

  • What are the fears?

  • My fear, starting with myself, is illegal activities.

  • Drug use -- smoking pot, not that I do it often -- is illegal in Taiwan, or in Singapore. I am afraid of a government using data not for the people, but against the people. I have a house in China, in Shanghai.

  • We mentioned last night my friend Josh Chin, who writes for the "Wall Street Journal" and who is absolutely great. He wrote a piece two weeks ago about how Baidu, Tencent, etc., are the ears and eyes of the Chinese government.

  • For me, the fear I have is not at all about Google, Amazon, and so on and so forth. It’s about some government -- my dad was imprisoned in East Germany for many years -- which doesn’t like me for whatever reason, finding data.

  • Now, that assumes they find correct data about me which, out of context, can be interpreted as being against the law. The second fear I have is that they can manufacture data about me that has nothing to do with the real world.

  • There, I hope -- though the hope is 0.1 percent -- that if we have data to contradict their data, our chance of not being put in prison is much larger.

  • Those are the only two fears -- which all seem very manageable -- that you have about data personally?

  • I have two passports, US and German. German people are worried about companies. Knowing the inside of one of them, I am not worried about companies. I personally think the EU’s right to be forgotten, by itself, is stupid.

  • You also need a right to be remembered. In my book, I put out six rights which we as individuals, as citizens, should have towards our data. Some of them do overlap with the EU regulation, and that was not because I had seen the new data protection...

  • Yeah, exactly, the GDPR, coming into force. The book was written before the GDPR came. That is May next year, the new data protection regulation.

  • Yes, digitalizing your insights.

  • It’s surprising -- like the right to port data, that’s very, very similar. I just went to New York for a couple of days this month to speak at an event about the impact of this on American firms and on international firms doing business.

  • Say you have an app out of Taiwan which can be downloaded from the German app store, and a German citizen is using it. What does that mean? I am personally extremely excited to see that there will be massive action on Facebook in May next year, when German people get organized to all say, "I would like to see my data."

  • The law is that you have to provide it, and provide it in a format that is understandable, that they can really make sense of.

  • It’s not a, "Here’s a dump of a few zeros and ones into a binary file." No, I need to understand Google Latitude, helps me understand my geolocation. I love that. I think that, for me, is empowering people against often what I call the industrial military complex.

  • You are not worried about companies, because you think they will be compliant?

  • What’s the worst a company can do? They can delete my account. It would suck, but I by mistake lost my YouTube channel this year. I have been through that. Not happily, but I survived it. It cost me a few weeks of my life, and nothing is back up yet, but OK.

  • Not much -- not much of a problem. What can Amazon do? They can stop shipping things to me. OK, I’ll find other ways. What can Google do? They can stop my email, my Gmail. I’ll go find another. They can even delete my entry in the database.

  • If you Google Andreas Weigend, nothing will come up. That would suck, but we will find ways around it. That’s why, compared to the government, where...I was going to take a hike a couple of months ago with my friend, Brad Rubenstein.

  • Brad is a smart, wonderful person, and we are both reasonable people. Our third friend, just as reasonable as us, was barred entry to the United States because some immigration officer looked at his mobile and found some gay dating app.

  • The sad thing is, that immigration officer thought he was doing the right thing for the country by not letting the person in, so it was only Brad and me who took the hike. We reflected on how citizenship, or the right to travel, let’s say.

  • How can some person think he is doing the right thing for the United States of America by not letting a person in based on an app that person has on their phone? This is not fantasy. This was happening.

  • I just told my friend, my accountant, Daniel Bao, who is an amazing person who went to Stanford together with Reid Hoffman. He used to run National Condom Week, so he is a person who gets things done.

  • He said, "Andreas, if you give," I think it was like, "10,000 bucks to some GLBTQ organization, then good things will happen." I told him, "Yes, do that before the end of the year," because this has been a tough year in the US.

  • Maybe one more story about data. Yesterday, we Googled Putin and Weigend. When you Google this, all the top hits are pictures of Vladimir Putin and me on November 10th last year. That was the day after the election.

  • I was the first US citizen whom Putin congratulated on the election. I had no idea that Putin knew more than probably anybody else I knew about what had gone on. It is, for me, a fascinating world of data, of manipulation.

  • Yesterday, I talked about the Facebook wall in parallel to Plato. You know Plato, the allegory of the cave, the wall there, that the world we see is just the world that Google or Facebook, etc., actually want us to see.

  • As I said before, it’s maybe a 0.1 percent difference we can make by opening up data, by embracing transparency, by creating accountability -- including, by the way, for my staff or for me.

  • Interesting. The most companies can do to you, you said, is essentially the loss of services and goods?

  • Then you also said that they may actually cause more, let’s say, social impact on an individual if, instead of a loss of goods and services, they fabricate a reality -- like Plato’s cave -- that people live in, and give people the impression of a synthetic reality, essentially?

  • Take Cambridge Analytica. One of the most exciting conversations I had was with a woman like Lisa at TED this year. I play the cello, and I managed to have the idea that I could go to a rehearsal. I went to a couple of rehearsals this year, one with the Berlin Philharmonic, where my friend plays, and one with San Francisco Symphony.

  • Then that day, when I had the San Francisco rehearsal all signed up, I was asked to be at KQED, which is the public radio station in San Francisco, at noon. I had to leave at intermission, because I was going to do a recording with the East Coast with a woman called Manoush. The recording was about what pictures tell about us.

  • The only reason I did this was that Manoush had done a show before about Cambridge Analytica. That was the best piece I have seen about something I have spent a lot of time reading about. She really was good, so that’s why I decided to do it.

  • I was at TED, and a random woman came up to me across the hall and said, "Andreas." "Oh, Manoush," I said when I read her name tag. They have some people who really... I’m not sure whether you know the show. It’s called Note to Self. Have you heard about it?

  • Yes, you know it? She is great. Just by exposing what companies know, what companies do, what the government knows, what the government does, I think she is doing a great job as a journalist.

  • Certainly, but I haven’t... The point was that the most a company can do, you said, is about the loss of business services. You also said that there is something companies can do that obviously exceeds the impact on an individual beyond the loss of services and goods, which is fabricating reality -- and, in a sense, causing a loss of the sense of reality.

  • Wouldn’t you say it’s a more serious thing than the loss of services and goods?

  • I would argue there, and that’s why I mentioned Plato. You could also mention Immanuel Kant, on having these categories of how we perceive things. You could mention quantum mechanics, that we only can observe eigenvalues to the operator.

  • For me, these are all the same thing: reality as such, we have no way of knowing. We only know it mediated through the senses.

  • I gave a talk at Google earlier this year, which is on the web. It’s very interesting that they cut out the end. I ended by saying that Google has an enormous responsibility in how they shape things -- they do the ways of worldmaking.

  • As the Harvard professor put it, they have an enormous responsibility to try to allow us to see more -- not all the way to the full six degrees, but maybe two degrees of the world, as opposed to 0.1 percent of a degree of the world.

  • I know Larry quite well, and it’s just a matter of pushing them. I’m not worried about Google much. I’m worried about Facebook big time. I know Mark somewhat, and that’s a very different game from Google -- how they, for instance, collaborate with the Chinese.

  • Google said, "Well, these are our principles. If you don’t want them, then I’m sorry, we’re not providing the services," whereas Mark says, "Oh, no problem. We’ll find some backdoors for you. We’ll help you out, if you can help me out with the revenue streams from the mainland."

  • You think Facebook is much more flexible and less principled?

  • Totally. Everyone knows that.

  • So when they publish according to GDPR everything about one’s data in an understandable and easy to access form, supposedly...

  • Supposedly. Under the laws to come, there’s no question about that. The EU earlier this year -- what was it, a fine of up to four percent of overall global turnover? I’m looking forward to Facebook actually getting whipped into place by the EU. I genuinely look forward to that.

  • We’re harmonizing with that framework here, too. In APEC, we’re having a GDPR compatible framework...

  • Taiwan intends to join the CBPR.

  • We’re firmly on the data agency side.

  • It’s the best thing we have.

  • In Taiwan, there is no central data protection agency. Every ministry is responsible for all the different businesses under the auspices of that ministry. There is very little...

  • Principles or translations?

  • Very little common good in which the different ministries...

  • Yes. We’re so far the only country with a modern data protection law without a central data protection agency. Instead, we have 31 or 32 data protection agencies.

  • That’s interesting, because I did a workshop for the FTC in the United States, where we had a really large number of data protection agencies from around the world. I do remember, say, the commissioner from Ireland -- that guy was good.

  • The Hong Kong guy was a mere puppet. There was the Korean guy with the Korean agency, the agency I visited afterwards. I was actually thinking that I did not meet the Taiwanese person.

  • Yes, because of the 32 people.

  • Ah, yes. I thought, it’s only that many people I can remember. That’s why I missed that.

  • The 32 people. If you meet zero people or 32 people, both are normal; one person, not so much.

  • (laughter)

  • Japan was like that, but now, they have their DPA as well. That leaves Taiwan. We’re unique, actually. We have a very modern data protection law, but it’s enforced differently ministry from ministry.

  • It also arguably provides a safer thing, because the minister of health and welfare would never work for the minister of economy. [laughs]

  • Last time I was in Taiwan, there was a discussion about the -- I’m not sure I’m using the right word -- national identity card, biometrics, and health insurance, which was a very solid discussion between it all, where I personally...

  • The reason I wrote the book is I believe we need to move society to become more data literate. Part of that is that if you really don’t want your health data in the cloud, you have an accident, and you are unconscious, then there is no way of finding your health data.

  • If, on the other hand, you have your ID card -- and I don’t know what the situation now is in Taiwan -- in Germany, everybody has their health card. In America, nobody has a health card.

  • In Taiwan, it’s mandatory to...

  • It’s mandatory. That discussion was about two years ago. There are pros and cons. Our job as educators or as government is to make sure it’s data for the people. My older brother is mentally retarded.

  • I’m very glad that he grew up in Germany, where he gets health insurance. He has a place where he can live with other people. What if you can see this from DNA, maybe even before the person is born?

  • What if they find out that, let’s say, I have some disease, then I’m applying for a mortgage, and they say, "30 year? No. No 30 year mortgage for you. 10 years, we can do 10 years." What about health insurance? We have preexisting conditions.

  • I had back surgery 10 years ago. I was very honest when I filled out the application form. I said I had back surgery before and stuff. Had I not been honest, I probably would have been stuck with a $200,000 bill.

  • They would have said, "Ah, you did not indicate that you had back surgery before. We are not paying for that." How can we watch the watchmen? How can we make sure about companies? That, I think, is the role of government. That’s the role of individuals.

  • I am not worried about Amazon price discrimination and this. Poor Jeff Bezos had to go and testify about this. This is not what I’m worried about. I’m worried about not getting insurance at a reasonable price.

  • Now, the fascinating thing is, we want companies -- say, banks -- to believe that fairness means that if somebody is at higher risk of not paying something back, we charge them a higher premium. I had lunch with the son of the CP Group family in Bangkok; they just became a shareholder in Ping An.

  • We want to actually be fair: if somebody is likely to have a problem paying that mortgage or that loan back, then it’s the right thing, also for that person, not to get that loan. This comes back to the same questions as before.

  • One, what if it’s used against the people, like for my brother? Or two, what if the data is wrong? This is influenced by my dad having been in prison. That is what I worry about: data for the people.

  • Why for the people? Why not with the people?

  • The "for" comes from the US: data of the people, by the people, for the people. Data with the people...

  • Isn’t that what it’s about? User-generated data, sousveillance.

  • Sousveillance, by the way, have you met the guy who invented sousveillance?

  • I haven’t, but I read a lot.

  • Yeah, Mr. Mann. He’s absolutely awesome. Steve Mann -- it’s one of the good things about writing a book, that random people whom we talk about in the book, whom I hadn’t met, say, "Hi, I’m Steve Mann. When do you have time to get together?"

  • We got together at the Stanford pool, because he likes to go to the pool. [laughs] That is brilliant. I translated the word into German. There’s Überwachung and Unterwachung. I’m very proud of that term.

  • It’s a great translation.

  • It’s a great, neutral term, once you get it. Unterwachung -- that you take pictures of the people. There are stories about the German police showing up at my brother’s place because I took pictures of what they thought was inappropriate. I didn’t think so.

  • Or my friend who runs big data for the Singapore police, Dan Ong, another Stanford student, whose grandfather -- he usually doesn’t tell you that -- used to own the red light district... Now, I hope I am not...

  • I don’t know about people’s religious beliefs. His father became very Christian to rebel against his grandfather. He doesn’t even call his grandfather his grandfather. He’s just, "Oh, my father’s father," to be clear that that is not values we subscribe to.

  • He works for the Singapore police. When he was a kid, he was flipping noodles, finding out what was happening in the red light district. Now he is in charge of the data, the big data project of the Singapore police.

  • These are interesting problems. An officer there told me: you know, Andreas, sometimes when we are in action, we might be running so fast that the thing gets disconnected. Then what can you do? We need to protect our country.

  • We can’t worry about our body cam when we are really in action. If it falls off, what can we do about this?

  • Yeah, what can you do?

  • What can you do? I’m not sure whether you know about that US case which I discussed in the book, where some court told a police officer whose body cam fell off, and his intercom wasn’t working, they said, "Sorry, but despite being under oath, we don’t believe what you said."

  • With the body cam falling off and the intercom not working, against just your one statement, the judge said, "We believe the other party."

  • That’s where the accountability comes in. If you are not accountable, you’re not part of the evidence.

  • It’s, "I am sorry my camera fell off, and my intercom wasn’t working," and, "We are sorry for you, too, but maybe it’s time to look for another job. If you can’t keep your intercom on, I don’t think it’s the right job for you." [laughs]

  • Yeah, but data from the people, too, so why not with the people?

  • Thank you for coming back to the topic. I do like "for the people," in contrast to "against the people." For me, when I came up with that title, I knew that was the title. The German book has the same title. It’s just that there is no way of translating this into German.

  • Olfa and I talked a lot about the Chinese title. I forgot what they finally picked in mainland.

  • It should be like, people data for the people.

  • The discussions we had with a couple of Taiwanese and a couple of mainland friends were so revealing. About the mainland title, the Taiwanese people said, "Oh, no, that sounds like propaganda."

  • (laughter)

  • It’s a bit tricky, "for the people." I don’t know Chinese, but I realized that it was a really beautiful discussion.

  • Because they use German, the term is, in Germany, very, very political, because they have a lot of...

  • What was the full title, again?

  • 人民的資料為人民. So there is two 人民, it’s just a political title for the book.

  • It’s going to be published here?

  • No -- actually, it came out in China, as a matter of fact, before it came out in the US. It’s an awful translation, I know it. I told the publisher in China we need a retranslation. I would also like to have a traditional Chinese translation. I don’t know how to go about it, but it cannot be that expensive to have it done.

  • Once you have one variant, it’s pretty cheap to have the other.

  • Yeah, but not just mechanically -- to have a person who actually thinks take the... I have the Chinese Word file, the mainland Chinese file. Then to actually fix it, so it’s a good traditional Chinese translation for Taiwan and Hong Kong, as opposed to the mainland one.

  • Many people, smart friends of mine, told me, "Andreas, this is not good translation." The German one is extremely good. I was in tears when I first saw the translation. I met with a translator, and I told him, "How can I thank you?" He said, "You cannot."

  • (laughter)

  • "I loved this book. What you see is the love I put into translating it." Then I said, "I saw that, and I can’t really say thank you." It’s in many languages by now, but the Chinese one is bad. They sold 10,000 copies at first, so I’m not talking about it not selling.

  • I’m saying I’m sad that it’s such a bad translation that people don’t really understand what it’s about. I have received some remarks. I’m clear that there is censoring going on. That’s not what I’m worried about. I’m worried about them just not caring, as opposed to caring.

  • Because it’s a mobilizing tool, isn’t it?

  • I think it’s not for political reasons. With the publisher, there were some parts which I was very aware they were not going to translate. That’s not...

  • (laughter)

  • I have no problem with that.

  • You have no problem?

  • Zero. That’s just the way...

  • It’s just data loss. [laughs]

  • What I have a problem with is that, on other things, they simply didn’t think about what things mean, and they just said something. Google Translate would have been just as good in some parts, my friends tell me.

  • Back to Taiwan: the book will find a way, will find somebody who is interested in translating and publishing it. They always say this. As I said yesterday, I was invited to give a talk at the UN maybe five years ago, to the General Assembly, on data being the new oil, data as a resource.

  • It’s just so sad for me to see that one of the democracies, with all the struggles of being a democracy, doesn’t have a seat in the United Nations -- for reasons that have nothing to do with Taiwan itself, but purely geopolitical reasons. It’s a sad story.

  • It’s OK. We have an avatar now.

  • We send a robot avatar.

  • (laughter)

  • What do you think a person who is very passionate about data can do -- about people and data, not about the chip data that Olfa studies, but just about the data we create? I’m very passionate about people. I’m very passionate about what we can learn about the interaction between people.

  • I’m very passionate about the balance of power between individuals, government, and companies. How can I help?

  • As a matter of fact, we are designing a DPA right now, using a multi-stakeholder setting, one of the stakeholders being Lisa.

  • It’s taken the form of rolling surveys, but we’re also thinking about getting recommendations from individual experts. So far -- when I say we, I mean the vTaiwan community -- we have talked with the Taiwan Association for Human Rights, as well as people who are more interested at the e-commerce level, of course.

  • They are already being impacted by the GDPR. They will want to know, if Taiwan has a DPA, what kind of impact it will have on them. Currently, a certain company under the purview of a certain ministry already knows pretty well how that ministry operates in its role as ministerial DPA.

  • If there is going to be a cross-ministry DPA, then that would change things, because the rules will be different. They are also stakeholders. They also have a stake in this.

  • For example, the eID is under the Ministry of the Interior. The health card is under the Ministry of Health and Welfare. There’s actually no way that they would be merged into the same card, just based on the reasoning of data protection.

  • The interior one is where they send people, or where the judge will go to sign? [laughs]

  • It’s the ID card. It’s also our national PKI, so everyone has this card, which is pretty advanced. It’s a combi card. It’s touch-based and so on.

  • It’s absolutely separate from the health card, but both are mandatory.

  • Where is the link? The link is the ID, the primary key?

  • Like Germany, there is no link between your health care?

  • Right. There is a health card serial number, of course. Inside of the health card, of course, there is your national ID. The national ID number, Taiwan has a national ID number. That can be joined, of course, with the ID database, but they never do that. It’s totally different data pipelines.

  • How can you make sure that they don’t do that?

  • Because they are under the auspices of different ministries, on physically different networks, with a very strict data protection law against use for unauthorized purposes.

  • It would be the same in Germany. I think if a ministry, let’s say the health ministry or... The interior ministry is different. I was on a German TV show with the then-minister of the interior of Bavaria, which is a very right-wing state, who afterwards became the minister of the interior of the country.

  • He was on one side. On the other side, I had the founder of the Green Party, who was put in prison, like, seven times by the minister of interior for launching it.

  • (laughter)

  • It’s also the minister for prisons. [laughs]

  • It was my charm which tried to move on the action. She said, "Oh, what an asshole. What a bitch." [laughs] Let’s go back to the question of data.

  • That’s right. The cross-ministerial purview of the DPA is what we are holding a general consultation on and designing at the moment. The other thing people often talk about here in Taiwan is that, as opposed to open data, there is a movement called Open Algorithms, started by Sandy and friends.

  • Sandy, the MIT Media Lab person.

  • Yes, but he is OK. There are other people who are less PR-oriented, who think more deeply about algorithms. At TED this year in Vancouver, I had the honor of giving a short speech. What I did was -- at TED, you have to think about the audience.

  • I took a shot glass, spit into it, and asked: what’s the value of the data? What’s the value of this? This is a very tricky question. The data question, I think, I have really thought through. The algorithm question is harder, because an algorithm without data means nothing.

  • In the book, I discuss, for instance, Amazon recommendations. This is more than a two-minute remark, because with an algorithm, you can distinguish between the parameters of the algorithm and the algorithm itself, if you will.

  • I did my PhD in neural networks at Stanford in ’91, which was a long time ago. For me, this is a world I have really, genuinely lived in for the last 25 years. It matters what the training data is. If, as an example, you’re much more likely to frisk African-American people for drugs, then guess what?

  • You’ll find a training set where -- wow, it’s amazing -- most people found in possession of illegal drugs are African-American. That’s because you have societal biases against these people.

  • Algorithms take much more discussion to understand what part is truly the algorithm design. What is the data we use to train the machine learning model? What really are the parameters, and how can we understand the arguments? This is very hard.
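A minimal, purely synthetic sketch of the point above (invented numbers; scikit-learn assumed available): if one group is frisked far more often, the arrest records used as training data over-represent that group, and a model trained on those records reproduces the bias even though true drug-use rates are identical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 100_000

# Synthetic population: two groups with the *same* true drug-use rate (5%).
group = rng.integers(0, 2, n)
uses_drugs = rng.random(n) < 0.05

# Biased policing: group 1 is frisked ten times as often as group 0.
frisk_prob = np.where(group == 1, 0.50, 0.05)
frisked = rng.random(n) < frisk_prob
caught = frisked & uses_drugs          # arrest records = the training labels

# A model trained on arrest records, with group membership as its only feature,
# "learns" that group 1 is ~10x riskier, although true usage is identical.
model = LogisticRegression().fit(group.reshape(-1, 1), caught)
print(model.predict_proba([[0], [1]])[:, 1])
```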

  • Right. I positioned this next to open data because, for example, the Ministry of Health and Welfare, because of mandatory insurance, has a lot of data about everybody’s health and insurance and how we’re using it -- basically, a clinical record of everybody.

  • Now, of course, that is for the purpose of administering health and welfare. It’s never used outside of the purpose of health and welfare. There are bound to be people, for social good I’m sure, who want to do analysis of the data outside of the purpose of health and welfare.

  • Now there are two schools of thought. One says that we should run differential privacy or k-anonymity, whatever, to make the algorithms... I call it privacy loss as pollution. Like, if you have a plant burning coal, burning gas, whatever, you want to put environmental controls in place so that it pollutes under a certain threshold, like epsilon in differential privacy.
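A minimal sketch of that "pollution threshold" idea, assuming numpy is available: a count query released with Laplace noise whose scale is calibrated to the differential-privacy budget epsilon, so a smaller epsilon means less privacy "pollution" per release but a noisier answer. The records and predicate are invented for illustration.

```python
import numpy as np

def dp_count(records, predicate, epsilon):
    """Release a count with Laplace noise calibrated to the privacy budget epsilon.

    A count query has sensitivity 1 (adding or removing one person changes it by
    at most 1), so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + np.random.laplace(loc=0.0, scale=1.0 / epsilon)

# Invented records: how many patients are over 65?
patients = [{"age": a} for a in (70, 34, 68, 45, 81)]
print(dp_count(patients, lambda r: r["age"] > 65, epsilon=0.5))
```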

  • I talk about this in the book, in a section which is complicated... Of course, it’s not easy to make it really simple. I talk about the necessary erosion. I do draw the parallel to the EPA: as you burn things, yes, there are side effects.

  • Exactly -- there are different chemicals which, when combined, are more toxic.

  • That part of the book, I think, is the complicated part, which no journalist ever talks to me about, because I think it’s simply not possible to really say this in very simple terms. Or at least I can’t. I can’t really do it -- the cost of using the data.

  • But there are other things. This, for me, is a second-order effect. Differential privacy is great. Synthetic data is great.

  • The notion that you can learn -- going back to what you said, that there are people who want to learn about drugs, I mean medical drugs, and so on. There is a lot to be said for giving them access to the data.

  • How can we prevent... I think what is absolutely critical is the answer that differential privacy gives: that in principle, you can’t use it against the person. My favorite example, which I’m sure you know, is if you want to learn about, let’s say, prostitution or drug use or whatever things are illegal... Is prostitution illegal in Taiwan?

  • It’s legal, but only if the local government has allowed it. So far, there are only five instances of brothels being allowed. It’s legal, but only in very limited circumstances.

  • In Germany, it’s simply legal; in Germany, it’s not an issue. In other countries, it’s an issue -- mainland China, for instance, although when you walk in the streets of Shanghai, every [non-English speech] is everywhere. What I hate is hypocrisy. One thing I love doing with data is busting hypocrisy.

  • The example I always give is: if you want to know something about an issue like that, I would give you a coin and say, "Here you go. Flip that coin. If it comes up with a number, you have to say the truth. If it comes up with a face..."

  • "...you can say whatever you want to say."

  • (laughter)

  • If it comes up with a face, you have to say yes. If it comes up with a number, you have to say the truth. That’s how it works: number, you say the truth; face, you say yes. Now Olfa flips.

  • We’re not supposed to know in Europe.

  • Say what comes up is a number. As you said, nobody knows what the outcome was -- you have to flip it in private. Then if it says one thing, you have to say the truth. If it’s the other thing, you have to say, "Yes, I saw a prostitute this year," or something. [laughs]

  • Now, the fact that you said you saw a prostitute -- well, I don’t know whether it’s because the coin forced you to say you saw a prostitute or because you had to say the truth.

  • I love this that the individual is protected, yet we can learn about the ensemble. We can learn about the group. I love such things.

  • If Olfa flips the coin again, we have less post-processing to do.

  • (laughter)

  • It will be statistically significant. [laughs]

  • Or if somebody observes what came up for Olfa.

  • I mean, if Olfa flips heads, he flips again and answers according to what the second flip shows; if it’s a number, he says the truth. It’s less complicated than...

  • I don’t know that one. I know the original one. I don’t know the variant where you flip again. I can’t think through how it works, but I know that with the original, nobody knows why you said yes, because...

  • It’s technically the same. It’s just if you were calculating averages and so on. It lessens the burden of post-processing.

  • Audrey is good, better than me.

  • (laughter)

  • Google does that, for telemetry. Firefox does that now, too.

  • Absolutely, there are many examples beyond the sketchy one about...
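A minimal sketch of the coin-flip protocol described above (randomized response, with made-up numbers): each person either answers truthfully or is forced to say "yes" by a private coin flip, and the true rate can still be recovered for the group, though not for any individual.

```python
import random

def randomized_answer(truth: bool) -> bool:
    """Flip a fair coin in private: face forces a 'yes', number means answer truthfully."""
    return True if random.random() < 0.5 else truth

def estimate_true_rate(answers):
    """Half the answers are forced 'yes', so observed_yes = 0.5 + 0.5 * true_rate."""
    observed_yes = sum(answers) / len(answers)
    return max(0.0, 2 * observed_yes - 1.0)

# Simulate 100,000 people, of whom 10% would truthfully answer "yes".
population = [random.random() < 0.10 for _ in range(100_000)]
answers = [randomized_answer(t) for t in population]
print(estimate_true_rate(answers))   # ~0.10, without knowing any individual's truth
```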

  • That’s the one school of thought. It’s essentially data anonymization.

  • The virtual world is not anonymization, I think.

  • So many people get fooled into thinking they can anonymize data. Let me tell you a story. One of the dearest friends I have is Dave Liu, a Taiwanese who was at Stanford. He worked for me. He’s the one I met at a New Year’s party. I had another friend from Kaohsiung, whose roommate Dave was, and the Kaohsiung guy always went out of the room, coming to cook.

  • I said, "What do you do? Do you cook?" "No." I called my roommate, I said I want to meet your roommate. I met Dave. Dave worked for me and then Dave went to Amazon, and then Dave convinced Jeff Bezos to hire me, so I know Dave well.

  • (background conversations)

  • Now I’ve forgot my story. [laughs]

  • I’m sure it’s good. It will come back to you. [laughs]

  • It’s simply, in trying to link something to the story.

  • Data anonymization?

  • Dave has a brother, Peter. When Peter was 16 -- they lived in Southern California -- Peter brought home an African-American girlfriend. His parents had an immune reaction and said it’s either the girlfriend or your parents, and Peter said, "It’s the girlfriend," and left home.

  • Then, when I was at Amazon, my dad passed away, so I called Dave on Christmas Day, and we talked. He was in Seattle; I was in Germany. Dave said something like, "I found my brother Peter through the Amazon database," [laughs] "and he came to my parents today, as a Christmas present, after 20 years." Pretty moving -- so much for anonymity. [laughs]

  • Anonymity, that’s right. Anonymization doesn’t mean 100 percent anonymization, but anyway, going that way.

  • Exactly to your point, Sandy was advocating something very different. He still advocates that the data never get this treatment -- post-processing, differential privacy, whatever; the algorithms should instead.

  • Researchers publish the algorithms. The algorithms get peer reviewed to ensure they minimize privacy loss, statistically speaking, whatever. The data owner then runs the algorithms and publishes only the statistics. Essentially, the algorithm just becomes another statistical method. Then the data never leaves the hands of the data owner, let alone gets post-processed or turned into synthetic data, or whatever.

  • Just as people would want the data to be radically usable, one has to find an algorithm that doesn’t have to strike a compromise between usability and privacy loss. Then it’s peer reviewed, so every data owner gets to run it simultaneously. That makes it scientific, according to Sandy.
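A minimal sketch of the Open Algorithms pattern as described here; the names, the approval mechanism, and the minimum-group threshold are invented for illustration. The point is only that vetted algorithms travel to the data and only aggregate statistics travel back.

```python
APPROVED = {}

def approved(fn):
    """Register an algorithm that has passed (peer) review."""
    APPROVED[fn.__name__] = fn
    return fn

@approved
def average_age(records):
    return sum(r["age"] for r in records) / len(records)

class DataOwner:
    MIN_GROUP_SIZE = 100     # invented threshold against tiny, identifying groups

    def __init__(self, records):
        self._records = records          # raw rows never leave this object

    def run(self, algorithm_name):
        if algorithm_name not in APPROVED:
            raise PermissionError("algorithm has not been reviewed")
        if len(self._records) < self.MIN_GROUP_SIZE:
            raise ValueError("group too small to release a statistic")
        return APPROVED[algorithm_name](self._records)

# Researchers get back a single statistic, never the records.
owner = DataOwner([{"age": 20 + i % 50} for i in range(1000)])
print(owner.run("average_age"))
```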

  • I am not sure whether I think this is just an empty story, or whether that actually has merit. I just don’t know.

  • They did the research. What I know is that it...

  • Sandy has done many...

  • This case, if it wasn’t Sandy, I would be less critical, a priori.

  • As far as I know, his idea is legal under the GDPR -- that’s the only thing I know. Algorithmically, whether it makes any sense, I have no idea.

  • Good question, good question.

  • I know it’s legal.

  • How do we audit algorithms? First of all, yesterday we talked about what was my best experience at Amazon and what was my worst experience. I said my worst experience was that I did not manage to convince Jeff Bezos to share data with the academic community.

  • I tried for a long time, and I was always biting granite. In retrospect, he was right, because you never know. We have seen many examples since: we never know what we can find out about individuals, things we genuinely didn’t expect when we shared the data. Jeff was right.

  • With auditing an algorithm, we are at day one. We do not know how to do it. Let me give you an example. After my PhD, I went to Thailand for half a year. I taught at Chulalongkorn. I was just back last month, and I gave a lecture at the university. I gave as an example...

  • (background conversation)

  • ...that the length of a name in Thai characters correlates well with whether the person is ethnically Chinese or ethnically Thai. Something as innocent as possible -- counting the length of a name in Thai characters -- is already a good indication of a person’s ethnicity.

  • Thailand is relatively safe. Take Malaysia -- and I don’t know much about Indonesia -- but I was at Axiata in Malaysia for a board meeting. I’m usually not that bad with faux pas, but I stumbled over two faux pas: I talked about pork, and I talked about alcohol.

  • That clearly was not what I should have done in an hour-long board meeting in that country. If you take this simple example of the length of characters, which is obvious to somebody who knows Thailand, and transfer it -- I don’t know how people in Indonesia or in Malaysia do last names, but probably something similar applies -- you can find out: is this true there, too?

  • An innocent variable like the length of characters can determine something which, in Hitler’s Germany, would have sent people to concentration camps. This is very difficult to catch when you analyze the algorithm.

  • Even if it just publishes the produced statistics, like a list of parameters for the lengths of last names?

  • Absolutely. At Amazon, one thing I did was -- I actually don’t know about Taiwan -- in the US, we have Mother’s Day. Do you have that here?

  • If person A sends a present to person B, where the first name is female... For that, we innocently went to the US Census Bureau. They have first names, and they have gender. There’s nothing in this thing I did which was intended to be in any way sketchy.

  • It was just finding out: what is the probability of somebody being the mother of somebody else? One, they have a female first name. Two, they get a present for Mother’s Day. Then you think about transgender people, and you realize that you built biases into this algorithm which you shouldn’t have built in, which you just didn’t think about at the time.
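An illustrative sketch only, not Amazon's actual method, with made-up name statistics: the kind of rule described above, which flags a gift recipient as a probable mother from a census-style first-name table and the order date, and which bakes in exactly the bias mentioned -- a name-based gender lookup is simply wrong for some people.

```python
from datetime import date

# Made-up, census-style shares of how often a first name is recorded as female.
FEMALE_NAME_SHARE = {"maria": 0.996, "susan": 0.998, "alex": 0.48}

def probably_a_mother(recipient_first_name, order_date, mothers_day):
    """Flag the recipient as a probable mother: 'female' name + gift in the week before."""
    female_share = FEMALE_NAME_SHARE.get(recipient_first_name.lower(), 0.5)
    days_before = (mothers_day - order_date).days
    # The hard-coded name lookup is where the bias lives: it misgenders some people
    # and silently excludes anyone whose name isn't in the table.
    return female_share > 0.95 and 0 <= days_before <= 7

print(probably_a_mother("Maria", date(2017, 5, 10), date(2017, 5, 14)))   # True
```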

  • There’s so many examples where after the fact, you feel bad that you didn’t think about it. I worked with very good people, and I can genuinely say that none of us intended such discrimination. It just fell out at the end, when we saw results.

  • I think nobody who doesn’t work at Amazon in this case, who hasn’t spent the time we have and can’t look at the data, has a chance to find such properties of the algorithm, because it’s the algorithm and the data combined.

  • You can’t get the data. As we said before, Jeff Bezos was right about one thing: you can’t just put the data out with the algorithm, because that way you would violate the privacy policies. It’s a very, very intrinsically hard problem, where I don’t see a solution besides trying to tell people who work at these companies to please, as Google says, try to do the right thing. Try not to do evil.

  • What about including more stakeholders into that audit, like transgender people?

  • Yeah, but I worry that even being very willing, and genuinely wanting to see things, that it is just very difficult to see. I think when it comes to auditing algorithms, I don’t think we have found...

  • A practical way for non-trivial algorithms.

  • ...a solution yet. I’ve read many papers. I’m reviewing some papers in the book about this. It’s been on my radar for years, since Amazon actually, that I’ve seen many stupid things about auditing algorithms.

  • If very willing people want to do the right thing, I don’t think people really know what to do. Internally, I think there is a chance, if you really have a smart engineer, being paid to try to figure out what the biases are.

  • That’s what black hat security figures do.

  • (laughter)

  • White hat, too, for that matter.

  • Externally, that’s why we need people like Olfa, who used to work for some company which tried to monitor what the mainland is doing...

  • ...in terms of computer issues, let’s say.

  • (laughter)

  • We need to put Olfa on such topics.

  • Of course it’s an open problem. It’s good that you’re highlighting the important things we need to consider, instead of just...

  • If we do 10 percent, it’s better than 0 percent. If we do 20, 50, 90 percent, I’m happy. The fact that we can’t do 100 percent, which I’m convinced about, should not stop us for the first 50.

  • That’s a very good... I agree, too. That must, of course, inform your standpoint about so-called interpretable artificial intelligence, which is actually a subclass of this problem.

  • Having done neural networks, and having written my thesis on, precisely, trying to interpret what hidden layers are doing, I have not thought about it, recently. The general black box argument of economists saying, "Oh, we would never trade anything which we don’t understand," and then saying, "Well, what does it really mean to understand something?"

  • Then they say, "Well, that means we are doing linear regression," and, "Why is this a deeper understanding than doing sigmoids?" -- that is, logistic regression.

  • You quickly realize that some subscribe to one paradigm of economics and others subscribe to other paradigms. In that case, I believe incentive alignment ultimately solves problems. What we have seen with IT, computer technology, communication, or whatever you want to call it, is that it has removed many barriers. It has removed many sources of friction.

  • The beauty is that what remains is basically the incentives. I love companies making this kind of software, like the CEO of Matlab, whom I met in Tokyo. He wants to do data science for three things. Among those three things is people analytics. I don’t know much about Taiwanese culture, but the fact is that in the entire room, out of his top managers, there were only men.

  • If you assume a Poisson distribution, it’s an unlikely outcome to have the room filled solely with men. If he genuinely is willing to look at whom he promotes -- and I’m not saying he is or isn’t -- then if the data make the implicit explicit, I think that’s a very good first step to changing behavior.
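A back-of-the-envelope version of that point, simplifying the Poisson remark to independent, gender-neutral promotion decisions from a roughly balanced pool:

```python
# Chance that all N top managers are men if each promotion were a fair, independent coin flip.
for n in (5, 10, 15):
    print(n, f"{0.5 ** n:.4%}")   # 5 -> 3.1250%, 10 -> 0.0977%, 15 -> 0.0031%
```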

  • We all have our biases. To be super clear, every now and then I meet a person who, I genuinely believe, simply does not have racial biases, period -- like my friend Min-Yao, a Singaporean kid whom I really like. Not many people genuinely convince me of this. The strength of Min-Yao is that he does not see it.

  • That’s great. Most people do have biases, and making it explicit for them -- no need for complicated algorithms, just take an Instagram photo, just look around the room -- goes a long way, I think.

  • As a matter of fact, gender-balanced representation is constitutionally protected in Taiwan. We are required to have a certain ratio of representatives in the Parliament and in councils at every level. That was a pretty advanced constitution when it was written.

  • I was actually super happy with the gender situation this year, with the outcome of the election. I’m very happy to be from Germany, where we have a woman leader -- we’ve had a woman for many years. [laughs] Those are the good examples.

  • That’s the alignment that you’re looking for.

  • Yes, the incentive. The alignment, this is more like getting the incentives clear.

  • Sometimes we need to... At Amazon, we always said, "Shine a bright light onto something, and then things will change." If you really shine a bright light onto things...

  • Last night, you talked about...one of the gripes I have with universities, UC Berkeley, where I teach. Yes, they do, every class has an official survey where the students evaluate the professor. I want that data to be public. I want students to base their decisions on data, and not on ratemyprofessors.com, where you have one percent of the students writing in.

  • No, I want them to base their decision on those 15 minutes every professor sets aside each semester to hear from the students. But no -- I was told by the dean, "No, no, we can’t allow them to slice and dice this data, because even you will be able to find out what your colleague’s rating is."

  • What’s wrong with that?

  • Yeah, what’s wrong with that?

  • (laughter)

  • No, that would infringe on their privacy.

  • I said, "At Stanford, you have sites where you can see class size, the level of the class, and then, yes, you’re down to a sample of two. You know, one of them is you and the other one is your colleague." "No, I don’t think we can do this."

  • On the one hand, they publish what they pay me -- which I have no problem with; it’s very little. On the other hand, there’s the output of what I produce. Of course, I do a good job of teaching; that’s why I’m interested in this. It would be the established professors who don’t give a fuck about the classes and what they’re teaching who might look bad.

  • Or they may make it clear, that they don’t really give a damn about it?

  • That’s the first step.