Friday, March 25, 2016

Navigating a sea of information, v3

There is a ton of information available today. With over 130 million books and a billion web sites officially registered, there's a lot to read. Add to that huge amount of information the user-generated content like the over 5 million Wikipedia articles (for just English), or the 300+ hours of video added to YouTube every minute, or the over 4,000,000 blog posts/day, or the incredible amount of information shared and published and consumed every minute (see below, or here for a great infographic from 2013), and it's obvious that without tools to navigate this sea of information, it will be easy to get lost, be misinformed, or to never find what you needed in the first place.
60 seconds of internet: Some numbers
You'll want to click that. It shows you the numbers (which are staggering), but it also gives a little update from the CEO of the company that produced it.

I recognize that this is a long post, so I made a 3 minute video summary of it to help orient you. I feel these ideas and topics are vitally important, and so I want to make this as digestible as possible while I'm exploring the topics. When I post my final version, it will be even more digestible, so please bear with me.
The question, then, is how do we make use of all this information? Is it impossible to make sense of all this "big data", or can we harness it to harvest useful information?

The problem: how do you know what info to trust?

Since we can't know everything all the time, we must rely on curated sources of information to help us digest what's important to us. This includes authority figures like professors and experts producing academic journals and reports; information aggregation services like news outlets, regular journals (magazines, for you Americans), newspapers, and radio; and other people who may or may not know what they're talking about, whose work we consume through things like blogs and social media outlets.

Because we are basing our understanding, opinions, beliefs, and, ultimately, our reasoning for our actions on this information, it's vital that we are able to know how trustworthy that source is.

I don't think it's difficult to persuade you that social media is very often not an accurate source of information. I also don't think it's difficult to see that major news outlets don't exactly weigh "accuracy of information" as the most important factor in stories. It's bad enough that there is now a service that occasionally fact-checks major news outlets, ranking the truthfulness of their information. And a study by Fairleigh Dickinson University, republished by Business Insider and others, shows us that even when much of the information is factual, it doesn't seem to lead to improved understanding of domestic or international events, which is often seen as a primary purpose of news outlets (click the link or image to read more).
Generally, there are 4 major problems that must be overcome to make information useful.
  1. Access. You need to be able to access the information to make any use of it. This seems obvious, but a lot of meaningful, useful information is inaccessible, either because of reader illiteracy (which can mean not being able to read it, or not understanding the topic well enough to make accurate sense of the information), or because the information is controlled. A common example of information control is the paywall, where a reader must pay to access the information (Netflix, some journal publications). This isn't necessarily bad, since these fees can help keep the content provider afloat and allow them to curate better information. The problem is that they can also prevent access for those who need that information, such as an important medical study being prohibitively expensive for doctors in developing nations.
  2. Authority. It's difficult to know whether a piece of content is trustworthy or not. Did the author consider all sides of the issue? Do they understand the core principles associated with what they are presenting? For example, Markham Nolan describes a situation where a video is posted, accusing a government of throwing bodies off a bridge to hide its murders. Reporters must then figure out whether the video is authentic and represents a true claim. He describes how his team worked through authenticating it.
  3. Bias. It's nearly impossible to present pure information without any bias, or consume information without interpreting it through preexisting biases. A study, published in Business Insider, showed how our own bias changes our perception of a source's trustworthiness (click the link or image below to read more). Additionally, people are prone to confirmation bias (more on that here), or may lock into a single explanation or understanding of an issue, and not want to change that understanding. Being aware of content providers' biases and motivations is important for understanding the content better.
     Trustability of news outlets
  4. Level of presentation. This refers to the vocabulary and amount of technical jargon used in a work, and the overall ease of understanding the work. For articles, you can have very technical papers that use a lot of jargon or special words, usually designed for telling others in that field about the research, or works that are just hard to read, though they could be written much more simply. You can also have the opposite problem, where something is so simplified that it really doesn't help the media consumer understand the topic. Media should support the audience it's intended for, and people outside that intended audience should do their homework to understand it. That's not to say that more effort to make "doing the homework" easier would be amiss.

Proposed solutions: some work better than others

I hope it's obvious to you that there's a problem. Several solutions against being misled have been offered. A typical example comes from Howard Rheingold's book Net Smart, where he presents three steps to verifying information.
  1. "Triangulate," meaning you find three different, unrelated sources to verify any piece of information, because if all three say the same thing, it's much more likely to be true.
  2. Look at the stakes the author/creator has. If the person is blogging for free, they are likely to have a lower stake than somebody whose job depends on the accuracy of the information. Often, the one with more to lose will be more reliable.
  3. Use your inner "crap detector." If something seems fishy or off, it likely is. This comes from experience, so isn't terribly helpful if you're new to some topic.
The problem with this approach is that companies and special interest groups know this is what you do. Sharyl Attkisson, who works in media, gave a TED talk discussing how groups exploit this approach by creating multiple, credible-looking sources of information that are not accurate or true. In fact, if you think we can verify information using Howard's three steps, you need to watch her talk:
She suggests looking for tell-tale signs that information has been planted, including:
  • Use of inflammatory language, like prank, crack, pseudo, conspiracy, lies
  • Claims to debunk myths that aren't actual myths. This can propagate, fooling some, and convincing those who disprove that myth that they're too smart to fall for it, even though the myth and its debunking are fabricated.
  • Attacks on people or organizations rather than on the ideas or facts
  • Public skepticism directed at those who question their claim
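For the curious, the first sign in that list is simple enough to sketch in a few lines of Python. This is only a rough illustration, not a tool Attkisson describes: the word list is just the examples quoted above, and the function name flag_inflammatory is made up for this post. A hit doesn't prove anything was planted; it only suggests the text deserves a closer look.

```python
import re

# Assumed word list: just the examples quoted in this post, not an official list.
INFLAMMATORY_WORDS = {"prank", "crack", "pseudo", "conspiracy", "lies"}

def flag_inflammatory(text):
    """Return the flagged words found in the text, in alphabetical order."""
    # Lowercase the text and split it into plain alphabetic words.
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & INFLAMMATORY_WORDS)

sample = "Only a crack theorist would believe these lies about the study."
print(flag_inflammatory(sample))  # ['crack', 'lies']
```

A check like this only covers the easiest of the four signs. The other three (fake myths, attacks on people, skepticism aimed at questioners) still need a human reader.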
Additional ideas came out of discussions with friends and family. One is to leverage subject enthusiasts, as they can often have experience to match experts and tend to be more helpful to newcomers. You can often find them in forums dedicated to the topic in question. We also discussed the usefulness of seeking divine revelation about a topic, as my friends and family believe in a God of truth, including truths on social, scientific, and other issues. Finally, we discussed the need to understand the "basic principles" or core principles of a topic, which could be the underlying physics for natural phenomena like climate change, or specific processes, like how voting districts and delegates logistically work for understanding elections in the United States.

Another Proposed Solution

While this can help detect problematic information, it doesn't solve the problems of understanding how to integrate new ideas or harnessing the massive amount of data online to sort out what you need. A true solution needs to:
  1. Be able to make use of collaborative effort (use the volume of info available, tap into multiple levels of understanding and willingness to help others understand)
  2. Be open to those who need it and have services that cost money so the solution can sustain itself and improve its offerings
  3. Provide verification of content providers' authority on the topics
  4. Clearly identify sponsors, supporters, and ambitions for the work presented
  5. Offer multiple levels of presentation
I have one proposed solution to organizing a lot of information, helping to spread reliable, researched information (see separate blog post about that), but that won't fully solve the problem--the general public still needs to be educated on the dangers of information aggregation and be offered tools for verifying information, and this education needs to be simple to understand and follow. I am open to ideas on this front, and I welcome any comments on any aspect of the above. This post has evolved, with the previous post on the topic being a summary of the discussion on previous versions. Version 2 and Version 1 of this post can be found at the links in this sentence.

