What is Sourcefinder?

Sourcefinder tests the performance and quality of the Duchamp source-finding application. We've built a simulated cube of the sky containing various radio sources, and it's Duchamp's job to work out where those sources are. We plan to run Duchamp over the whole cube of simulated data to work out how many of the radio sources it can find. Duchamp needs to identify the real sources while keeping false positives to a minimum.

As this project is in beta, we welcome any feedback, advice, or bug reports. You can either post on the forums or email us at icrar.tsn.website@gmail.com. We're happy to hear from you.

News

Changelog 22 March 2017
Deployed a new validator that will save invalid work units locally so I can investigate possible computation errors.

Modified the assimilator to delete completed work units from the project's upload directory.

Added a new log rotator to stop log files from getting too large and to make them easier to search.
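For anyone curious what size-based log rotation looks like, here is a minimal sketch using Python's standard `logging.handlers.RotatingFileHandler`. This is an assumption for illustration only: the changelog doesn't say which tool the project's rotator uses, and the function name, file name, and size limits below are all hypothetical.

```python
import logging
import logging.handlers
import os
import tempfile


def make_rotating_logger(path, max_bytes=1_000_000, backups=5):
    """Create a logger whose file rolls over once it reaches max_bytes."""
    logger = logging.getLogger(path)  # one logger per file path
    logger.setLevel(logging.INFO)
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


# Demonstration with a deliberately tiny size limit so rollover happens quickly.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "validator.log")  # hypothetical file name
log = make_rotating_logger(log_path, max_bytes=500, backups=3)
for i in range(100):
    log.info("processed work unit %d", i)

# The directory holds the live file plus at most `backups` older files.
print(sorted(os.listdir(log_dir)))
```

Capping both the file size and the number of backups keeps total disk use bounded, and each individual file stays small enough to search quickly.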

Posted to the boinc_projects mailing list asking for help with some of the VirtualBox issues we've been having. Hopefully someone there has advice or knowledge that will help us solve them.

This week's stats are as follows:

Total Cubes: 6944
Total Results: 31836
Total Canonical Results: 5021 (72.31%)
Average Results Per Cube: 4.58

Good Results: 14426 (45.31%)
Bad Results: 17410 (54.69%)

Client Bad: 459 (1.44%)
Client InProgress: 16672 (52.37%)
Client Good: 14705 (46.19%)

Server Inactive: 0 (0.00%)
Server Unsent: 6 (0.02%)
Server InProgress: 299 (0.94%)
Server Over: 31531 (99.04%)
There are two main reasons we have so many bad results: I deployed a buggy version of the validator, and the parameters_4.tar.gz file was removed by the BOINC file deleter.
Hopefully this coming week will be more informative, because the new validator should be fixed, and both the validator and assimilator will store any invalid or errored work units. That will give me a much better look at the work units that are actually failing.

It also seems that developing a non-VirtualBox version of the project could be a lot more difficult than I first imagined. The Duchamp application we run relies on a set of external astronomy libraries, and I don't know if I can compile and deploy them properly for Windows or Mac. I'm going to investigate and experiment further to see if I can work something out.
22 Mar 2017, 4:43:38 UTC · Discuss

Changelog 15 March 2017
Fixed an internal issue with the file deleter not removing old work unit files.

EDIT: I was testing out a new validator for this batch, but it's causing issues, so I've had to revert to the old validator.

I'm currently doing a local run of all 7000 test work units (which will probably take a while...) so that we have a set of known correct results to compare the project's results against. This way we can get a much more accurate estimate of how correct the results are.
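The comparison against a local reference run could be sketched roughly as below. This is only an illustration under assumptions: the work unit names, the idea of representing each result as a set of detected source IDs, and the `agreement_rate` helper are all hypothetical, and the project's real result format may differ.

```python
def agreement_rate(reference, project):
    """Fraction of shared work units whose project result equals the reference."""
    shared = reference.keys() & project.keys()
    if not shared:
        return 0.0  # nothing to compare yet
    matching = sum(1 for name in shared if reference[name] == project[name])
    return matching / len(shared)


# Hypothetical data: work unit name -> set of detected source IDs.
reference_run = {
    "wu_0001": {"src_a", "src_b"},
    "wu_0002": {"src_c"},
    "wu_0003": set(),
}
project_run = {
    "wu_0001": {"src_a", "src_b"},
    "wu_0002": {"src_d"},  # disagrees with the local run
    "wu_0003": set(),
}

rate = agreement_rate(reference_run, project_run)
print(f"{rate:.1%} of compared work units match the local run")
```

Only work units present in both runs are compared, so the estimate stays meaningful while the local run is still in progress.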

I also found out that, somehow, only 2000 of the 7000 work units were sent out last week. I don't know how that happened, but you'll be getting all 7000 this time :)

This week's statistics:

Total Cubes: 1968
Total Results: 4164
Total Canonical Results: 1848 (93.90%)
Average Results Per Cube: 2.12

Good Results: 3940 (94.62%)
Bad Results: 224 (5.38%)

Client Bad: 210 (5.04%)
Client InProgress: 14 (0.34%)
Client Good: 3940 (94.62%)

Server Inactive: 0 (0.00%)
Server Unsent: 0 (0.00%)
Server InProgress: 120 (2.88%)
Server Over: 4044 (97.12%)
The percentage of good results is about 1 percentage point higher this time (94.6% versus 93.6% last week), which is encouraging.
We're still aiming to get that up to 95% though.

Additionally, I'm wondering whether it would be worth investing the time in a version of this project that doesn't use VirtualBox. Currently, vboxwrapper and VirtualBox issues are the biggest causes of errors for this project, and many of them might be solved by simply bypassing VirtualBox altogether and shipping native binaries for each platform.
The actual core of the client is written in Python, but I believe I can compile that to a binary using Cython (or a similar tool).
What do you all think? Is it worth trying a non-VirtualBox approach to avoid the problems we've been having?
15 Mar 2017, 0:24:11 UTC · Discuss

Changelog 8 March 2017
Profiles are now enabled.
Team import is now enabled.

Modified the /duchamp/join.php page to specify that this project requires VirtualBox.

Configured the Akismet anti-spam system and reCAPTCHA to prevent forum spam.

Wrote a script to report the work unit statistics for a particular week.

More work will be coming out very soon after this changelog: 4x as much as last week!

This week's current statistics are as follows:

Total Work Units: 1736
Total Results: 3711
Total Canonical Results: 1663 (95.79%)
Average Results Per Cube: 2.14

Good Results: 3474 (93.61%)
Bad Results: 237 (6.39%)

Client Bad: 236 (6.36%)
Client InProgress: 1 (0.03%)
Client Good: 3474 (93.61%)

Server Inactive: 0 (0.00%)
Server Unsent: 0 (0.00%)
Server InProgress: 76 (2.05%)
Server Over: 3635 (97.95%)
The key figure here is the canonical results, where we're already doing well: fewer than 5% of the work units sent out last Wednesday are still in progress.
We're getting a little over 2 results (tasks) per work unit, which is close to what we want. Ideally, we need as close to 2 results per work unit as possible, as this means there were no computation errors and no instances of BOINC needing to send out duplicate tasks.
We have 93.6% successful results and 6.4% failed results, so that's also a pretty solid starting point. Our aim is to get the failure rate below 5%.
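For reference, the headline figures above follow from four raw counts. The snippet below is a minimal sketch of that arithmetic, not the project's actual reporting script; the function and key names are made up for illustration.

```python
def summarise_week(work_units, results, canonical, good):
    """Derive the headline ratios for one week's batch from raw counts."""
    if work_units == 0 or results == 0:
        raise ValueError("need at least one work unit and one result")
    bad = results - good  # every result that is not good counts as bad
    return {
        "canonical_pct": 100.0 * canonical / work_units,
        "results_per_work_unit": results / work_units,
        "good_pct": 100.0 * good / results,
        "bad_pct": 100.0 * bad / results,
    }


# Counts taken from this week's statistics.
summary = summarise_week(work_units=1736, results=3711, canonical=1663, good=3474)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

A results-per-work-unit figure close to 2 means very few tasks had to be re-sent.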
8 Mar 2017, 6:26:14 UTC · Discuss

Future transition from beta to production.
Hi everyone,

I thought I should lay out my plans for how Sourcefinder is going to transition from its current beta state into a production state.

From now on, the work units pushed out each week (on a Wednesday) will have their error rates measured. Any work units that fail due to computation errors or BOINC errors will be tracked and logged. Note that work units with invalid results won't be counted, because they still represent a valid computation. This will continue until we're consistently losing less than 5% of work units to errors. At that point I'll start looking at transitioning the current Sourcefinder server to a full production state.
I'll also be including the current work unit error rates in the weekly changelog.
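As a rough sketch of how the "consistently under 5%" rule could be checked, the snippet below treats "consistently" as a fixed number of consecutive weeks. That interpretation, the three-week default, the function name, and the sample rates are all assumptions made for illustration.

```python
def ready_for_production(weekly_error_rates, threshold=0.05, weeks_required=3):
    """True if the most recent `weeks_required` weeks are all under the threshold."""
    if len(weekly_error_rates) < weeks_required:
        return False  # not enough history to call it consistent
    recent = weekly_error_rates[-weeks_required:]
    return all(rate < threshold for rate in recent)


# Hypothetical weekly error rates, oldest first (fractions, not percentages).
rates = [0.064, 0.054, 0.041, 0.038, 0.045]
print(ready_for_production(rates))                  # strict 5% target
print(ready_for_production(rates, threshold=0.10))  # relaxed 10% target
```

Keeping the threshold as a parameter makes it easy to see how the decision would change under a 5% or a 10% target.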

If anyone has any thoughts on this, please let me know.
I'm also interested in whether my 5% target is too strict, and whether I should relax it to 10% or something similar.

Thanks,
Sam
1 Mar 2017, 5:57:06 UTC · Discuss


Changelog 1 March 2017
Fixed an issue with automatic server tasks not running. The BOINC stats dump should now update every 3 hours.

Added the following config entry to the project configuration:

<resend_lost_results>1</resend_lost_results>

Increased the credit multiplier from 1.3 to 3.0.

Implemented a new validator that will be tested throughout the week. It aims to solve some strange issues that have been popping up with the BOINC sample validator.

And of course, more work units to process. :)
1 Mar 2017, 5:49:30 UTC · Discuss


©2017 ICRAR