There is no Wisdom in Crowds

@ebrown p,

I don't know about any quantitative research; Google, specifically, is usually rather tight-lipped about their algorithms, launches things only as a beta and pulls them again later...

But Butrflynet's link here

http://www.ted.com/index.php/talks/blaise_aguera_y_arcas_demos_photosynth.html

really shows the power of this idea. Watch the whole video - it starts out rather interesting, but you'll see the far-reaching implications towards the end of it (thanks for posting that link!).

1 Reply

@ebrown p,

The TED video I posted earlier goes into that aspect of tagging/labeling images.

0 Replies

@ebrown p,

Quote:

Has anyone seen any quantitative research on the effectiveness of methods such as this?

Yes, but I don't remember a link off the top of my head. Luis von Ahn's research (he works for Carnegie Mellon University) is a great starting point. He invented the recaptcha and image tagging systems.

The concept behind recaptcha is actually much more interesting than the image search, but really the same principle. Carnegie Mellon was asked to help stop Yahoo's spammers in 2000, and Luis von Ahn later came to regret how much time humans wasted filling out his captcha tests. So he set out to harness the collective work people were doing to digitize old books. It works well, and launched with 99.1% accuracy despite the deficiencies of the crowd (foreign users who don't speak the language, bad spelling, etc.).

1 Reply

@old europe,

Here are some more examples of the Photosynth concept. These are some collections that have been put together.

http://labs.live.com/photosynth/collectionHome.htm

http://en.wikipedia.org/wiki/Photosynth

This Wikipedia article goes deeper into describing the process:

Quote:

The Photosynth technology works in two steps. The first step involves the analysis of multiple photographs taken of the same area. Each photograph is processed using a Scale-invariant feature transform algorithm, which identifies specific features, for example the corner of a window frame or a door handle. Features in one photograph are then compared to and matched with the same features in the other photographs. Thus photographs of the same areas are identified. By analyzing the position of matching features within each photograph, the program can identify which photographs belong on which side of others. By analyzing subtle differences in the relationships between the features (angle, distance, etc.), the program identifies the 3D position of each feature, as well as the position and angle at which each photograph was taken. This process is known scientifically as photogrammetry, with similar products available such as Imodeller, D-Sculptor, and Rhinoceros. This first step is extremely computationally intensive, but only has to be performed once on each set of photographs.

The second step involves the intelligent display of and navigation through the 3D point cloud of features identified in the first step. This is done with the publicly downloadable Photosynth viewer. The viewer resides on a client computer and maintains a connection to a server that stores the original photographs. It enables a user to, among other things, see any of the photographs from their original vantage point. It incorporates technology Microsoft obtained through its acquisition of Seadragon in January 2006. The Seadragon technology enables smooth zooming into the high-resolution photographs without downloading them to the user's machine.
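For anyone curious what "features in one photograph are then compared to and matched with the same features in the other photographs" might look like, here is a rough sketch. It is not Photosynth's code. It assumes each feature has already been reduced to a numeric descriptor (real SIFT descriptors are 128-dimensional; the tuples in the usage note are toy 2D values), and it uses a nearest-neighbour ratio test, which is one common way to do the comparison. The function name and the 0.75 threshold are made up for illustration.

```python
from math import dist

def match_features(desc_a, desc_b, ratio=0.75):
    """Pair each descriptor in desc_a with its nearest neighbour in desc_b.

    A pair is kept only if the best candidate is clearly closer than the
    second best (a ratio test), which discards ambiguous features.
    The ratio value is an assumed, illustrative threshold.
    """
    matches = []
    for i, a in enumerate(desc_a):
        # Rank every descriptor in the other photo by Euclidean distance.
        ranked = sorted((dist(a, b), j) for j, b in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches
```

Calling `match_features([(0.0, 0.0), (10.0, 10.0)], [(10.1, 9.9), (0.1, 0.0), (5.0, 5.0)])` pairs the first feature with index 1 and the second with index 0. A feature that sits equally close to two candidates is dropped rather than guessed at.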

1 Reply

@Robert Gentel,

Bah goes the sheep

1 Reply

@blindsided,

Fart goes the brain.

0 Replies

@Butrflynet,

Butterfly, I read up on Photosynth (including the links you provide)... and although this is a cool application, it doesn't rely on the "wisdom of crowds". I don't see how this is applicable here.

2 Replies

@ebrown p,

Did you watch the video, ebrown? The application is just the framework, for sure... The interesting thing is the possibility to harvest user data - to read tags and semantic information millions of users have attached to their photos...

0 Replies

@Robert Gentel,

Robert, I don't think the recaptcha thing applies as "wisdom of a crowd" either.

In this case, work is being farmed out to individuals by a central system. I am not skeptical about this technology at all.

Incidentally, this same technique is being used by spammers to defeat captcha systems. They set up their own captcha thingy (on a page with desirable content). But, they trick the users by simply echoing the captcha from the target site (e.g. Yahoo)... so their users think they are getting access to one site... but really they are giving automated spambots access to Yahoo accounts.

At any rate...

A system to harness the work of many individuals (as in the example you give) does not make me skeptical at all.

It is the schemes that pretend there is some combined knowledge of a group that is greater than that of the individuals who make it up that I suspect of being empty hype.

3 Replies

@ebrown p,

To me, it's the idea that crowd moderation will produce anything resembling a desirable result to the vast majority of people in that crowd.

Cycloptichorn

0 Replies

@ebrown p,

Robert Gentel may be thinking of genetic algorithms for swarms - mathematically these do produce better results for some apps.

On non-mathematical crowds, the classic work is this one:

Quote:

The conclusion to be drawn from what precedes is, that the crowd is always intellectually inferior to the isolated individual, but that, from the point of view of feelings and of the acts these feelings provoke, the crowd may, according to circumstances, be better or worse than the individual. All depends on the nature of the suggestion to which the crowd is exposed. This is the point that has been completely misunderstood by writers who have only studied crowds from the criminal point of view. Doubtless a crowd is often criminal, but also it is often heroic.

http://etext.virginia.edu/toc/modeng/public/BonCrow.html

0 Replies

@ebrown p,

Neither do I actually. I mean if it's doing stuff with different people's photos it might be a bit related, but I don't get that connection either.

0 Replies

@ebrown p,

Quote:

In this case, work is being farmed out to individuals by a central system. I am not skeptical about this technology at all.

You aren't thinking this through all the way. How does the system know whether you typed accurately or not? How does it have 99.1% accuracy?

It used the collective to correct the individual.

Quote:

Incidentally, this same technique is being used by spammers to defeat captcha systems. They set up their own captcha thingy (on a page with desirable content). But, they trick the users by simply echoing the captcha from the target site (e.g. Yahoo)... so their users think they are getting access to one site... but really they are giving automated spambots access to Yahoo accounts.

Similar technique, but there's a crucial difference. The spammers don't care if the users are right or not. If enough are right they get what they want and they don't care about the misses and wrong uses.

Recaptcha is digitizing books for important uses. It needs to be able to know if the user accurately typed in the words, and it's very different from captcha in that captcha aims to make the user solve a problem the technology already knows the answer to. They know what the letters are and they make you recognize and type them.

Recaptcha is a project that aims to do what the technology can't do. It is digitizing words that the system didn't know the answer to. It can only do this through the wisdom of the crowd. If one person provides an answer they don't know whether to trust it. But by having multiple people provide the answer they learn how to have the crowd do its own quality control.
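To make the "crowd does its own quality control" idea concrete, here is a minimal sketch of accepting a transcription only once enough independent users agree on it. This is an illustration, not recaptcha's actual implementation: the function name, the minimum of three votes, and the 60% agreement threshold are all assumptions picked for the example.

```python
from collections import Counter

def consensus(answers, min_votes=3, min_share=0.6):
    """Return the agreed transcription, or None if the crowd has not converged.

    min_votes and min_share are hypothetical thresholds for illustration.
    """
    # Normalize so trivial differences (case, stray spaces) don't split the vote.
    cleaned = [a.strip().lower() for a in answers if a.strip()]
    if len(cleaned) < min_votes:
        return None  # too few independent answers to trust any of them
    word, votes = Counter(cleaned).most_common(1)[0]
    if votes / len(cleaned) >= min_share:
        return word
    return None  # no clear winner yet; keep showing the word to more users
```

With the answers `["morning", "Morning ", "moming", "morning"]` this returns `"morning"`: the one misreading is outvoted. With only two answers, or with three answers that all disagree, it returns `None` and the word would simply go back into the queue.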

Get it yet? It's getting really tedious if you aren't even willing to concede that there's possible utility for large-scale teamwork or that there is useful and intelligent collective data to be mined in large datasets. If we are still at the level where you deny the very possibility of these concepts then I think you are being too obdurate to really be helped by any good example of crowdsourcing. You seem convinced already that it doesn't exist because you've defined it as not existing.

You define the crowd as dumb so when I give you an example of Google using the crowd's information you portray it as "smart Google, dumb crowd" and stick to the mantra that the crowd was not wise.

If that's what you are going for, what's the point of giving you good examples of this? You just dismiss it with a facile characterization of the crowd and ignore that something very valuable came out of it.

I guess this all boils down to this: are you interested in the prominent applications using these concepts or are you asking for something that will fit your personal definition of "wise"? Because if you've already decided that crowds aren't wise by definition and are going to dismiss all the examples merely by calling the crowd dumb then this is a very pointless exchange.

1 Reply

@ebrown p,

Ebrown wrote:

Wikipedia is an interesting example... individual edits as a collection. However, wikipedia has the idea of "editors"... a group of individuals that control and direct the crowd. Even in wikipedia, the crowd by itself is a disaster.

Actually, this is not true.

Have you ever edited Wikipedia pages? I've edited a bunch of pages. Although there are editors (volunteers as far as I know, who have themselves risen up from the "crowd", at least that's how I understood it), they really rarely ever intervene.

If you go to the most controversial political and science issues, you will find enough of the warning labels - Content disputed, etc. But if you take a random sample of Wikipedia pages (by using the random page option), you'll find that such pages are really very much the exception. In my experience, in something between 90% and 99% of the cases, the pages are collaboratively written by users, without editorial interference.

Wikipedia simply attracts enough users who want to edit, that errors and deceptions are fairly quickly picked up and corrected without an "official" ever having to butt in. And note that even when editors intervene, it is most of the time just to slap a warning sticker on the content in case of dispute - they usually don't dig in and start correcting texts. Instead they slap on a label, and then wait for the "crowd" to recover the item to acceptable standards.

1 Reply

@Robert Gentel,

You are mischaracterizing and over-hyping recaptcha. Recaptcha is a great idea, but it isn't magic.

The coolness of recaptcha (and other similar ideas) is that it gets lots of people to each do a little bit of work (for free). This means that lots of work gets done... but there is no inherent magic in the crowd.

In these tagging tasks, increasing the number of people doesn't make the group any smarter (it just improves capacity).

Two or three people in a room could read 100 words with the same accuracy that 1,000,000 people could.

There is no mystical crowd phenomenon here.

4 Replies

@ebrown p,

Quote:

Two or three people in a room could read 100 words with the same accuracy that 1,000,000 people could.

http://upload.wikimedia.org/wikipedia/de/thumb/5/50/Dichtefunktion.png/360px-Dichtefunktion.png

Quote:

There is no mystical crowd phenomenon here.

Nobody said so.

0 Replies

@nimh,

I love Wikipedia... however it has well-known problems of exactly the type you are downplaying which have required banning users and locking topics. And, it is still not very difficult to see (or add) incorrect information (I take it you are not a Colbert fan).

But even this isn't an example of a mystical crowd wisdom. When I edit articles, I go in and correct the parts that bother me. I am working as an individual and I leave the page in a state that is acceptable to me as an individual.

Even so, I am not skeptical of collaboration... the applications I am skeptical of involve semantic filtering and tagging.

2 Replies

@ebrown p,

Quote:

The coolness of recaptcha (and other similar ideas) is that it gets lots of people to each do a little bit of work (for free). This means that lots of work gets done... but there is no inherent magic in the crowd.

http://i129.photobucket.com/albums/p205/huklbre/smileys/doh.gif

Now you're gettin' it.

0 Replies

@ebrown p,

Quote:

You are mischaracterizing and over-hyping recaptcha.

Where?

Quote:

Recaptcha is a great idea, but it isn't magic.

I never said it was magic. I said it was a good idea. Now you are asking for examples of "magic"?

Quote:

Two or three people in a room could read 100 words with the same accuracy that 1,000,000 people could.

Depends on the two or three people. Randomly selected, they can't do as well as a broader selection.

Quote:

There is no mystical crowd phenomenon here.

Nothing mystical going on here and I never spoke of anything like this. This thread seems more about obtuseness than mysticism. I can't satisfy your moving definition of what would constitute a good example of crowdsourcing because you just look at them and say "it is not good".

0 Replies

@ebrown p,

ebrown wrote:

The coolness of recaptcha (and other similar ideas) is that it gets lots of people to each do a little bit of work (for free).

The coolness of "crowd wisdom" is that it gets lots of people to collectively provide the data you need to make your product better (for free).

How big is the difference of opinion here, really, rhetoric and immediate response to the a2k voting system aside?

Now how can having lots of people collectively provide data make your product better? For one, as you pointed out, it helps you predict the crowd's own behaviour far more effectively than any estimate by an individual or office team. You have (somewhat conveniently) defined this part of the "crowd wisdom" concept as irrelevant to your question, but that doesn't change that it is part of the concept you're discussing and dismissing. (It certainly seems the issue at hand re the a2k voting system, where the point is to get the topics on top that are most likely to interest people).

Can it also serve to more efficiently arrive at an objective truth? That's where the recaptcha example comes in, as I understand it. Do 1,000,000 users typing in an ambiguous code together have a higher rate of success in typing it in correctly than 2 or 3 people? I would say that's pretty obvious: if you have 100 people typing in the same code, chances that the one that most of them type in is the correct one seem high, to me. Whereas one or two individuals, over the course of their working day, can easily make mistakes and oversights. That's what I think Old Europe's graph pertains to (?). Are you saying you don't believe this is true? On the basis of research you read or the like, or on the basis of a personal guesstimate like mine? (Or partly on the basis of an instinctive reaction to the idea of crowds deciding something?)
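The guesstimate can be put into numbers with a small back-of-the-envelope sketch. It rests on two simplifying assumptions that are not from anyone's research: every reader is independently right with the same probability p, and an answer counts as accepted only when more than half of the readers give the correct one. The function name is made up for the example.

```python
from math import comb

def majority_correct(n, p):
    """Probability that more than half of n independent readers are right,
    when each one is right with probability p (binomial model)."""
    need = n // 2 + 1  # smallest number of correct answers that is a strict majority
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(need, n + 1))
```

Under these assumptions, with p = 0.8, a single reader is right 80% of the time, three readers reach a correct majority about 89.6% of the time, and 101 readers are wrong only vanishingly rarely. So three people in a room already do better than one, but they do not match a much larger group unless each of them is individually very reliable. If readers tend to make the same mistake (not independent), the gain from adding more of them shrinks.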

0 Replies
