This will be tough to explain, but...
What do you do when you are developing software that needs to deal with a large number of variations in semi-structured data from the web?
I told myself it was finished, but today, after feeding it more random samples, it is failing again. There is progress, though, as more bugs get fixed.
I feel like I'm so close, yet it also feels like every step forward results in two steps backward because more bugs keep being discovered. I'm not trying to be a perfectionist, but I feel that it needs to handle most variations in semi-structured data to be useful to my future users.
Past clients of mine have shown strong interest in this software, but I feel that once they realize they are getting a poor success rate, I will lose them.
Mentors say that I should release early and often, in small increments. However, I feel that my software currently does not even perform very well.
My patience is running out and I am burning out. I have been working on and researching this software for the past 2 years. My release date is late January, but the number of bugs (caused by large variations in the sample data) makes that difficult.
I'm not sure if I am expressing myself correctly, but this software is not a typical CRM or a simple game with clear-cut features. At a minimum, its function is to deal with large variations across semi-structured data found on the web, but reaching that goal seems to require much more testing and fixing.
Every bug I fix does bring progress, but it just seems endless.
**UPDATE:** Happy New Year! Hi everyone, I realized why I kept getting the same bugs. The problem is that my algorithm overlooked a very simple parameter (one of the answers here gave me the idea). Thank you, James. It makes perfect sense why I kept seeing similar bugs over and over!
I wish there was a way to select more than one answer! All of the answers here are equally great!
The lean startup model is fantastic and I support it wholeheartedly, but the truth is, it doesn't work for every type of business.
There are some business models where it is absolutely critical to your market to release a product that has a full feature set and high accuracy.
It sounds like you need to take some time to get your head out of the coding and think about your software as a business. Is it critical to release with the degree of accuracy your gut is telling you? If so, why? What will the impact on your customers be if the software isn't as accurate as you would like?
If a high degree of accuracy is imperative, maybe you need to look for angel funding to reach the point you need to be at before release. That is not the lean startup model, but it is still a relevant one, depending on what your business model is.
Take a deep breath and hang in there. Every piece of software has bugs, so don't let them get you too frustrated. I'd recommend doing something like the following:
If you don't feel like your main features are stable enough by your release date, push back the release. Delays happen in software all the time, and it will be easier to overcome a delay of a few weeks or even months than the initial bad reputation that would result from your main features not working.
Cut scope.
Your basic problem is that there is a lot of data on the internet, and you can't possibly support every type of it. It's just not going to work: by the time you add support for all the types of data that exist today, new types will pop up, and you will never finish.
Think about your first client (it can be an imaginary client) and about the smallest subset of data they absolutely need. That is the smallest subset where using your software provides any value whatsoever, not the smallest subset you think is acceptable.
Release this initial version and continue to prioritize new data types by the frequency your customers are failing to process them.
(I don't know exactly what data you are trying to process, so I may be wrong but from your question I get the feeling that it is one of those endlessly dynamic types of data where you can't possibly support everything).
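The prioritization step above (ranking unsupported data types by how often customers fail to process them) could be sketched roughly as follows. This is only an illustration: the `failure_log` structure, the `data_type` labels, and the `prioritize_failures` helper are all hypothetical, and a real product would collect this information from its own error reporting.

```python
from collections import Counter

def prioritize_failures(failure_log):
    """Rank data types by how often processing failed on them.

    `failure_log` is a hypothetical list of dicts, one per failed
    processing attempt, each carrying a "data_type" label.
    Returns (data_type, failure_count) pairs, most frequent first.
    """
    counts = Counter(entry["data_type"] for entry in failure_log)
    return counts.most_common()

# Made-up sample data, purely for illustration.
failure_log = [
    {"customer": "a", "data_type": "product_table"},
    {"customer": "b", "data_type": "event_listing"},
    {"customer": "a", "data_type": "product_table"},
    {"customer": "c", "data_type": "forum_thread"},
    {"customer": "c", "data_type": "product_table"},
]

ranked = prioritize_failures(failure_log)
for data_type, failures in ranked:
    print(f"{data_type}: {failures} failure(s)")
```

The type at the top of the ranking is the next candidate to support; everything further down can wait until customers actually run into it more often.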
Without knowing the specifics of your product, I can't offer any fixes. But if you have been working on this for 2 years and you are still discovering major problems, it sounds like you have a fundamental problem with your parser/algorithm/methodology.
Are you just collecting more and more datasets/samples and then 'bug fixing' to handle each 'exception'? Did you start off with a clear concept of how you were going to solve the problem, or are you guessing your way through each new data sample?
To be the bearer of bad news: you might need to stop, take a deep breath, then revisit your core code and decide if it is properly designed to handle the job. Do you need a modular approach? E.g., if you are parsing web logs, you might need an Apache module, an IIS module, etc., where you can add specific features, instead of a monolithic one-thing-does-it-all-and-will-rule-the-world design. It might give you a better marketing/sales approach too: when a customer has a new or different data set, you can sell or publish a new module.
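As a rough sketch of what such a modular approach might look like, here is a small parser registry. Everything in it is an assumption made for illustration: the `register` decorator, the `PARSERS` table, and the two log formats are heavily simplified stand-ins, not real Apache or IIS log parsers and not the asker's actual design.

```python
from typing import Callable, Dict, Optional, Tuple

Detector = Callable[[str], bool]
Parser = Callable[[str], dict]

# Hypothetical registry: module name -> (detector, parser).
PARSERS: Dict[str, Tuple[Detector, Parser]] = {}

def register(name: str, detect: Detector):
    """Decorator that adds a parser module to the registry."""
    def decorator(parse: Parser) -> Parser:
        PARSERS[name] = (detect, parse)
        return parse
    return decorator

# Simplified assumption: Apache-style lines carry a quoted request.
@register("apache", detect=lambda line: line.count('"') >= 2)
def parse_apache(line: str) -> dict:
    method, path = line.split('"')[1].split(" ")[:2]
    return {"module": "apache", "method": method, "path": path}

# Simplified assumption: IIS-style lines are unquoted, space-separated
# fields in the order date, time, method, path, status.
@register("iis", detect=lambda line: '"' not in line and len(line.split()) >= 4)
def parse_iis(line: str) -> dict:
    fields = line.split()
    return {"module": "iis", "method": fields[2], "path": fields[3]}

def parse_line(line: str) -> Optional[dict]:
    """Try each registered module in turn; None means 'not supported yet'."""
    for detect, parse in PARSERS.values():
        if detect(line):
            return parse(line)
    return None

apache_result = parse_line(
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
)
iis_result = parse_line("2024-01-01 00:00:00 GET /default.htm 200")
unknown_result = parse_line("garbage")
```

With this shape, supporting a new data source means adding one more decorated function rather than touching the core loop, and a `None` result marks an input that no module handles yet.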