<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
		>
<channel>
	<title>Comments on: Who will hire the Zagat editors if this works?</title>
	<atom:link href="http://blogs.buzzillions.com/2008/07/02/who-will-hire-the-zagat-editors-if-this-works/feed/" rel="self" type="application/rss+xml" />
	<link>http://blogs.buzzillions.com/2008/07/02/who-will-hire-the-zagat-editors-if-this-works/</link>
	<description>What we do for your reviews</description>
	<lastBuildDate>Thu, 12 Nov 2009 19:35:57 -0800</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
		<item>
		<title>By: Daniel Tunkelang</title>
		<link>http://blogs.buzzillions.com/2008/07/02/who-will-hire-the-zagat-editors-if-this-works/comment-page-1/#comment-513</link>
		<dc:creator>Daniel Tunkelang</dc:creator>
		<pubDate>Wed, 02 Jul 2008 23:16:15 +0000</pubDate>
		<guid isPermaLink="false">http://blogs.buzzillions.com/?p=279#comment-513</guid>
		<description>I tried Pluribo. It&#039;s intriguing, but it&#039;s certainly not a complete solution. I see two main concerns.

First, its current implementation is limited to a small set of product categories, and it&#039;s not clear to me how much work is required on their part to add a new category--or how granular their product categories are.

Second, while the summaries are a great starting point, they are a bit too lossy for me. They don&#039;t give me a holistic picture of the product, but instead make me feel like I&#039;m just seeing a random pro or con.
Compare them to, say, the sets of scores used to rate products on Circuit City, where each product type is typically rated along 4 dimensions. There, users explicitly assign scores for each dimension. Or compare them to the Buzzillions approach of summarizing reviews as weighted lists of pros and cons--again soliciting them from users rather than trying to extract them automatically from text.

Finally, I love playing with NLP algorithms as much as the next computer science PhD, and we do our share at Endeca. But information extraction algorithms (which should really be called heuristics) will never be 100% accurate. Sometimes they are the best option we have. But, in this case, why guess when you can just ask the user?</description>
		<content:encoded><![CDATA[<p>I tried Pluribo. It&#8217;s intriguing, but it&#8217;s certainly not a complete solution. I see two main concerns.</p>
<p>First, its current implementation is limited to a small set of product categories, and it&#8217;s not clear to me how much work is required on their part to add a new category&#8211;or how granular their product categories are.</p>
<p>Second, while the summaries are a great starting point, they are a bit too lossy for me. They don&#8217;t give me a holistic picture of the product, but instead make me feel like I&#8217;m just seeing a random pro or con.<br />
Compare them to, say, the sets of scores used to rate products on Circuit City, where each product type is typically rated along 4 dimensions. There, users explicitly assign scores for each dimension. Or compare them to the Buzzillions approach of summarizing reviews as weighted lists of pros and cons&#8211;again soliciting them from users rather than trying to extract them automatically from text.</p>
<p>Finally, I love playing with NLP algorithms as much as the next computer science PhD, and we do our share at Endeca. But information extraction algorithms (which should really be called heuristics) will never be 100% accurate. Sometimes they are the best option we have. But, in this case, why guess when you can just ask the user?</p>
]]></content:encoded>
	</item>
</channel>
</rss>
