
You've Built a Great Technology, Now What? (My Dilemma)
Jul 10, 2007 12:24 perm link Readers: 769

This post is a little different, in that I don't have any clever answer or solution to blog about. The issue is this: I've spent the last few months building a search technology that's a little different than anything I've seen done; it's actually more of a "research engine" in fact. The questions are pretty basic: is it worth something and what do I do with it?

Now I'm not going to lay out everything here in any great detail, but bear with me in my search for answers (a pun). I hope in this process others can learn from my experience.

I've built a functional prototype and written a paper on the concept, enough to see it has merit. Most search technologies are focused on finding something you already know something about (buy ipod). What if you really aren't sure what you are looking for? Before you laugh and think this is a dumb question, imagine being a patent examiner looking for prior art, or an FBI agent looking for evidence, or a writer researching a story. In all these cases what constitutes an "answer" isn't well known up front. You need to examine a large body of information with only a vague idea of where to start. You'll know answers when you see them.

That's what my technology is aimed at. I came up with this idea a decade ago but never had the time to work on it. I figured it was likely to be done by someone eventually but so far I haven't seen it. So instead of doing consulting (and making a living!) I took the time to build this to the point of knowing it works and being able to demonstrate it.

The possibilities are fairly obvious but the path is not.

(1) Startup on my own

(2) Startup with investment (VC)

(3) Find a partner (existing search or related company)

(4) Give it away

The major difficulty with all but #3 is that this is a technology not a product. It needs to scale to much larger document collections (my prototype is 18,000 small documents) and needs other infrastructure to make it a viable web application. Once it's scalable it needs tuning to whatever kind of document collections it will be applied to. One thing this won't do is replace search engines like Google; it's much more suited to vertical document collections not generic web pages.

My preference is #3, either as a partner or even "captured" as an employee to work on completing it as a product. I contacted Google via their contact form and got no reply; I figured it was a black hole (everyone contacts Google), so that's no surprise. Google would make sense as they already have the world's largest search infrastructure. There are some 300 search engine companies, not to mention other companies with search needs or products.

I have more than 60,000 contacts via my LinkedIn profile, which is nice, but which are the right contacts? Connections by themselves aren't useful unless you can make enough of an impression that there is a reason to communicate further.

This is also the issue with #2. I live in the Dallas-Fort Worth area, which as far as I can tell is somewhat of a dead zone for software startups (like a donut hole: the money and the startups are mostly on the country's edges). VCs only talk with people via real connections, and of course are mostly interested in startups with a revenue proposition; it's hard to sell something you can't sell to someone. I'm not Y Combinator material and in fact having done a startup before (in the old days) I'm not sure it makes much sense to do it again in this case.

#1 is not a real option; I've already invested what I have just getting to here, and need to make a living again soon. Working after work on something complex is too hard.

#4 is obviously easy and cheap but I'm not there yet; the biggest issue is that I wouldn't get to do much more with it (having a regular job and less time) although it would be nice to see what others could do with it. Even with a prototype it's not really something you just plug in somewhere; the concept is more powerful than the existing code.

Of course I could be imagining value where there is little; however, I have a lot of experience with search, with invention and with just plain thinking different, so it's not based on nothing. Once I played with the prototype I could see it was useful; the question was where to go from there and whether I could make something of it.

Any comments or suggestions are welcome. Email is codistconsulting / gmail.


  • Alexandre Jacques: Jul 10, 2007 13:30

    Hi,

    I saw a video one of these days and found it interesting. Maybe it can help you:

    http://video.google.com/videoplay?docid=-3755718939216161559&hl=en

    I know it's a bit of a "show" but there are some interesting things there.

    Regards and good luck.

  • Hagge: Jul 11, 2007 00:49

    How about Document Management companies?

    http://www.alfresco.com comes first to mind since we are in the process of implementing them but there's also EMC Documentum, Xerox DocuShare and IBM with all the different things they do (they bought FileNet some time ago which does document/content management). Oh and ZyLAB, they make the document management and search tools that powered the Enron investigation as well as the European Parmalat investigation. Sounds like something they would absolutely be interested in since they do that kind of thing for a living. Microsoft works with search too.

    Good luck and let us know how it goes!

    ...Hagge

  • John Lewicki: Jul 11, 2007 05:55

    This sounds very close to some of the ideas in the Endeca search engine. Have you compared your ideas to that product?

  • Daniel Quinn: Jul 11, 2007 07:18

    Frankly, I think the best option is #4. Not only is it the easiest, but if your product is sound, it would solidify your reputation as someone who clearly knows what they're doing. There's a reason that people like Linus Torvalds have so much time to work on the projects they love, it's because big companies are throwing money at them based on their body of work and reputation within the community.

    It's your invention of course, and it's up to you, but imagine the implications of your software if it's held privately by a few companies rather than made available for free to every non-profit and security agency. As always, we all profit when things are Free.

  • codist: Jul 11, 2007 07:27

    I looked at these but none really go where I did. I do have a contact at IBM.


What the Heck Could You Do With 16 Exabytes?
Jun 26, 2007 18:35 perm link Readers: 4075

In 64-bit architectures, the address space encompasses 16 exabytes. MacOS X 10.5, for example, can address this much as virtual memory, although it "only" supports 4 terabytes of physical RAM.

Given that my first computer only had 4K of RAM this is an amazing number of bytes. I thought my first hard drive, at a healthy 5 MB, was plenty big at the time.

To put an exabyte into perspective, the ramp up from my first hard drive to a terabyte (you can buy a terabyte drive these days for around $300 US) is roughly the same order of magnitude as the jump from a terabyte to an exabyte. Each time hard drive technology improves you hear people wonder what good so much hard drive space is, yet we never seem to have any trouble finding a use.

A single movie compressed on a DVD is around 5GB, an HD movie around 25GB. Working with raw HD data (even that is usually compressed somewhat) you need much more. So imagine you need 500GB to store a modest HD movie during editing. In an exabyte you could work on 2 million HD projects at a time.

In the early 90's a copy of Deltagraph (Mac) was around 2.5 MB in size, which shipped on several floppy disks. In an exabyte I could keep 440 billion copies.

The MMPOG game I play (Battleground Europe) uses about 700MB RAM during gameplay. In an exabyte there would be room for 1.5 billion times more data.

Google's data for its search engine is apparently around 1 petabyte or so, a mere 0.1% of an exabyte. In a 64 bit address space you could fit it 16,000 or so times.
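If you want to check my arithmetic, here is a quick sketch in Python. It assumes binary units (an exabyte as 2**60 bytes), which is what the figures above appear to use; the names in it (`how_many`, `figures`) are just mine for illustration, and the sizes are the rough ones quoted in this post, not measurements.

```python
# Back-of-the-envelope check of the figures above, using binary units
# (1 KB = 2**10 bytes, and so on up to 1 EB = 2**60 bytes).
KB, MB, GB, TB, PB, EB = (2 ** (10 * n) for n in range(1, 7))

address_space = 2 ** 64  # bytes reachable with a 64-bit address


def how_many(item_size, capacity=EB):
    """Whole number of items of item_size bytes that fit in capacity bytes."""
    return capacity // item_size


figures = {
    "exabytes in a 64-bit address space": address_space // EB,
    "500 GB HD editing projects per exabyte": how_many(500 * GB),
    "2.5 MB Deltagraph copies per exabyte": how_many(5 * MB // 2),
    "700 MB game footprints per exabyte": how_many(700 * MB),
    "1 PB search indexes per 64-bit address space": how_many(PB, address_space),
}

for label, value in figures.items():
    print(f"{label}: {value:,}")
```

With decimal units (an exabyte as 10**18 bytes) the counts come out somewhat lower, for example 400 billion Deltagraph copies instead of roughly 440 billion, but the orders of magnitude are the same.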

It's a big number, which seems pointless to consider: who would ever use this much data, either on disk or in virtual memory? Yet technology continues to discover reasons to use more and more storage. I can imagine that some day the division between permanent storage (disk) and RAM will vanish; everything you work with will exist in a single address space. In this way an exabyte doesn't seem as far off as it appears.

The nasty fly in this exabyte ointment is of course software. How do we develop software that can take advantage of almost limitless address space, not to mention tens, hundreds or even thousands of processors, or even enormous grids of these machines? Somehow software evolution has to speed up or all that hardware potential will be wasted.

Another fly people don't think about (but Google does) is power - at some point you have to provide juice for all these bytes.

So this is mere speculation for now; we still have a long way to go before a 64 bit address space seems tight. Perhaps by then we can simulate the human brain and let the computer figure out what to do next.

Of course by the time 128 bit address spaces are needed, we might all be obsolete and it won't matter.


  • Eric TF Bat: Jun 26, 2007 20:35

    The standard answer to any computer-related question in the form "What the Heck Could You Do With [some amount of memory capacity]" has remained constant since the invention of the computer: fill it up with porn!

    Well, that's the common answer. Me, I'd just run Emacs and it'd all fill up eventually without any further effort.

  • Sidu: Jun 27, 2007 03:27

    @Eric: lol. Good one.

  • gwenhwyfaer: Jun 27, 2007 09:31

    Not that it matters too much. The amount of memory that can keep up with the computer will probably stay the same 64k or so (give or take a couple of powers of 2) that it's been for the last 30 years...

  • gwenhwyfaer: Jun 27, 2007 09:32

    s/the computer/each processor/

  • Michael Speer: Jun 27, 2007 09:56

    Store my future fully three dimensional holograph projection films and 3DHP games. That's a lot of textures changing over the course of the material which must be stored along with realtime alterations and animations along with the locations of the individual specks of dust that must be projected.

    If it exists, we will find a use for it.

  • Vic: Jun 27, 2007 11:15

    Hey, awesome article, but I believe someone has stolen it and is not citing the source.

    http://technowirenews.blogspot.com/2007/06/what-could-you-do-with-16-exabytes.html

  • Adam Ierymenko: Jun 27, 2007 11:52

    Here's some:

    Huge games

    Evolutionary computation

    Huge realistic simulations

    Computational physics

    Huge CAD projects (e.g. doing CAD on a whole city for urban planning)

  • Gary: Jun 27, 2007 12:55

    Good pickup, Vic. I posted a "this material is stolen from ..." comment

  • Chad Crabtree: Jun 27, 2007 12:56

    Or a Whole SPACE station!

  • ratsbane: Jun 27, 2007 13:16

    Umm... you could use it for word processing... and... um, email and stuff.

  • codist: Jun 27, 2007 13:21

    Apparently the person (MARCHÉ BOURSIER) at technowirenews.blogspot.com is apt to steal content from just about anyone without attribution. Unfortunately there is no contact information and Blogger's DMCA takedown procedure involves snail mail, so there's not much I can do. However, if all of you would flag the blog as objectionable, maybe it would get taken down.

  • B: Jun 27, 2007 14:17

    In ten years you will probably wonder how you ever thought the address space was small. Think back to when 32bit was introduced, 4GB of address space was unimaginable back then.

  • Manuel: Jun 27, 2007 15:02

    Nice musing.

    Hm, you could consider how fast you can sweep (e.g. read all memory). Hypertransport can access 22GB per second, so we get 2**64/(22*2**30) = 780903144 seconds we spend reading from the first to the last byte. This is about 24 years.

    If we can push the bus speed by a factor of 1000 then it would take 9 days. If we could push the bus speed by a factor of 1M then it would take 13 minutes. A speedup of 1M is a lot, however, and I wonder whether physics will carry that many more electrons through copper in the mid term future.

  • zerokewll: Jun 27, 2007 19:17

    One word: Skynet

  • Chui: Jun 28, 2007 00:11

    I'm not sure if there's a market for porn in "multi mega super high def", but if there is, exabyte storage is the way to go.

    :)

  • Medbob: Jul 02, 2007 10:52

    Along the same lines as Manuel's comment, I'm having increasing trouble moving data from disk to memory. It seems that the storage paradigm is moving faster than our ability to move and process the data.

    That being said, after taking 10 minutes to load up my 24 million row database in memory, it would certainly run a lot faster in memory than it does on disk.


If I Had A VC's Ear, What Would I Pitch?
Jan 20, 2007 20:02 perm link Readers: 699

Venture capitalists are known for only talking with people they know, or with people who know people they know. It's how the game is played; if they simply talked with everyone who walked in the door, they would never get anything done. Of course that makes it tough if you lack contacts.

In the mid-80's I started a company (Data Tailor) to build and market an unusual spreadsheet program (Trapeze, of which I will write a later posting), and raised money from various contacts. It was a miserable experience but the only way you could start a software company in Texas at that time. To bankers in this area then "software" meant clothing (this actually happened to me). Long before the dot com boom, VC money was not really an option for a small startup. We eventually sold the code to another company, and went on to develop Persuasion and Deltagraph under contract. The stockholders got a little back, lots of tax writeoffs, and my foray into the software marketing business ended sadly.

Of course today is a completely different era, but you still need contacts to raise money (unless you have well-to-do friends or family). The good news is that money is available at a much smaller level than it was 20 years ago. But if you don't have money on your own (and I don't really) and your contacts are nice but not linked in much (yes I know that's a pun now), it's still difficult to develop an idea (an old adage is "an idea is just an idea without a management team behind it"). So if a magical VC leprechaun appeared to me, what would I pitch?

I have two ideas. One I have been mulling over for about 10 years in the search arena, but no one has approached my idea yet during that time (that is public anyway) so I will hold on to that one still.

The other isn't really completely original, but then virtually nothing is. It's funny how some startup will get funded, and yet so will the second, the third, sometimes even the nth startup in the same space (think ajax start pages). Most of these will fail and wind up in the dead pool. It's often not how original the idea is, nor how timely (though it helps!) but how well it gets executed by the team. Sometimes it's just pure luck that the timing is right.

In 1993 Deltagraph was out of our hands and we tried to come up with something new to work on. I came up with the idea of a distributed instant messaging application for Appletalk called Intercomm. After working on it for a bit (it had instant messages, store and forward messages, polling and a fully distributed server architecture) I gave up, and the rest of the team couldn't find anyone to publish it, so it was abandoned. A couple of years later the internet boom started. Oops, too early, and not the right target (Appletalk instead of TCP/IP). Soon thereafter IM became a big (although profitless) business. Oh well.

OK, so what idea would I pitch? If you need to contact a set of people by phone, email, fax, pager, IM or whatever all at the same time and have an audit trail (so you know who was contacted when), then you need something like this.

Imagine a soccer coach canceling practice (contact 12 soccer moms on the go or at work), a school system wanting to contact every truant student's parents, a city disaster management team needing to be called together when the big storm hits, or even the city calling everyone in town to boil the water first. Interestingly there are companies out there doing some of this but no one has a single broad vision of how useful this is. Some examples are Call-Em-All Voice Broadcasting Service and One Call Now. The problem is they simply focus on calling phones. What I imagine would be even more useful is to combine all forms of communication, so you can code the system to try multiple phone numbers, emails, IMs, pages, whatever, until a satisfactory response is received. The original message can be prerecorded, or even typed, and then triggered from any communication type the system supports (including the web).
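To make the "try everything until someone answers" idea concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Contact`, `AuditEntry` and `Notifier` names, the stub senders and the addresses are all invented for illustration, and a real system would plug actual phone, email or IM gateways in where the stubs are. It only shows the fallback order and the audit trail, not the reliability or scale a real service would need.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Dict, List, Tuple

# A sender takes (address, message) and returns True if the recipient acknowledged.
Sender = Callable[[str, str], bool]


@dataclass
class Contact:
    name: str
    # Ordered (channel, address) pairs to try, most preferred first.
    routes: List[Tuple[str, str]]


@dataclass
class AuditEntry:
    when: datetime
    contact: str
    channel: str
    address: str
    acknowledged: bool


@dataclass
class Notifier:
    senders: Dict[str, Sender]
    audit_log: List[AuditEntry] = field(default_factory=list)

    def notify(self, contact: Contact, message: str) -> bool:
        """Try each route in order until one is acknowledged; log every attempt."""
        for channel, address in contact.routes:
            sender = self.senders.get(channel)
            if sender is None:
                continue  # no sender registered for this channel; skip it
            ok = sender(address, message)
            self.audit_log.append(
                AuditEntry(datetime.now(timezone.utc), contact.name, channel, address, ok)
            )
            if ok:
                return True
        return False

    def broadcast(self, group: List[Contact], message: str) -> List[str]:
        """Notify everyone in the group; return the names nobody could reach."""
        return [c.name for c in group if not self.notify(c, message)]


# Stub senders standing in for real gateways (assumptions for this demo only).
def phone_stub(address: str, message: str) -> bool:
    return False  # pretend nobody picks up


def email_stub(address: str, message: str) -> bool:
    return address.endswith("@example.com")  # pretend these get acknowledged


notifier = Notifier(senders={"phone": phone_stub, "email": email_stub})
team = [
    Contact("Pat", [("phone", "555-0100"), ("email", "pat@example.com")]),
    Contact("Sam", [("phone", "555-0101")]),
]
unreached = notifier.broadcast(team, "Practice is canceled tonight.")
for entry in notifier.audit_log:
    print(entry.contact, entry.channel, entry.acknowledged)
print("Could not reach:", unreached)
```

This sketch contacts people one after another for simplicity; contacting a whole town at once would mean fanning the attempts out in parallel, which is exactly where the hard scaling problem lives.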

The key to such a business of course is reliability; it would be pretty useless to try warning a city of impending doom and then have the system crash. That's the real success, ensuring a flawless, highly scalable (burstable) response which may occur infrequently. So far in this industry that's a tough one to guarantee.

The nice thing about this business is you can make real money, either by charging a fee to list, a fee to contact, or some combination. You could even let the soccer coaches use it for free. Unlike the typical web 2.0 site, you are actually offering a tangible service people would pay for (and do, judging by those two companies' websites). I don't know how many times I've wanted to be called or call a group of people about a changed time and never could get everyone. You do have to plan a bit ahead but once you have a group listed, there are a lot of things you can use this for.

So what's needed to make this work? People who have lots of experience in building voice systems (which is easier today with VOIP and voice response systems that actually understand people), high scalability and reliability (not as easy as it looks) and some nice web UIs. Not something you can do in your garage with couch change. Perfect for a group of experienced folks who know people who know VCs.

So if you are one of those folks, go do it.

As for me, I will have to wait and see if I can pitch the other idea someday. In the meantime, back to looking for work.


Copyright © 2007 By Andrew Wulf