Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Tobias Pietzsch
On Dec 5, 2013, at 11:56 PM, Curtis Rueden <[hidden email]> wrote:

Hi again,

> * Encapsulates internal dependencies such as java.util.concurrent
> (e.g., Avian doesn't have that package!).

And Android / J2ME doesn't have java.util.concurrent either. I am interested in porting ImageJ2 to Android (which would necessitate a port of ImgLib2 to Android). The more "weird" Java packages we use, the more involved it will be to refactor those usages out later to make such a thing possible. The "swappability" of SJC services will obviate the need to do it in some cases.

On a closer look, ThreadService uses java.util.concurrent.Callable and java.util.concurrent.Future.
It also extends java.util.concurrent.ThreadFactory. So the Android/Avian argument is not really valid.

To reiterate my previous question: why not extend the ExecutorService interface instead?
One answer I could give to that myself is that we do not need/want the full ExecutorService, e.g., an imglib algorithm should not be allowed to shutdown() the ExecutorService.
On the other hand, we do not need/want the full ThreadService, e.g. using newThread() from the ThreadFactory is not wanted.

So could we maybe extract the part of ExecutorService that we are interested in (invokeAll, invokeAny, and submit variants) into a new interface and make ThreadService extend that?
This would still not allow us to pass a standard ExecutorService into imglib algorithms, only a ThreadService. I could live with it.
The question then is whether this should live in imglib, in scijava-common, or somewhere else. I hesitate to make scijava-common a dependency of imglib. I know that just having scijava-common as a dependency does not force me to use the application container, but still… Making imglib a dependency of scijava-common is not a good idea either.
Hmm...
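
A minimal sketch of what such a narrowed interface could look like. The names TaskExecutor and ExecutorServiceAdapter are hypothetical and are not part of scijava-common or ImgLib2; the adapter only forwards the three kinds of calls named above (one submit variant shown) and keeps shutdown() out of reach. Lambdas (Java 8+) are used for brevity only.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical name: only the part of ExecutorService an algorithm needs.
interface TaskExecutor {
    <T> Future<T> submit(Callable<T> task);

    <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks)
            throws InterruptedException;

    <T> T invokeAny(Collection<? extends Callable<T>> tasks)
            throws InterruptedException, ExecutionException;
}

// Hypothetical adapter: wraps any ExecutorService without exposing shutdown().
final class ExecutorServiceAdapter implements TaskExecutor {
    private final ExecutorService delegate;

    ExecutorServiceAdapter(ExecutorService delegate) {
        this.delegate = delegate;
    }

    @Override
    public <T> Future<T> submit(Callable<T> task) {
        return delegate.submit(task);
    }

    @Override
    public <T> List<Future<T>> invokeAll(Collection<? extends Callable<T>> tasks)
            throws InterruptedException {
        return delegate.invokeAll(tasks);
    }

    @Override
    public <T> T invokeAny(Collection<? extends Callable<T>> tasks)
            throws InterruptedException, ExecutionException {
        return delegate.invokeAny(tasks);
    }
}

class NarrowInterfaceDemo {
    // An "algorithm" that only ever sees the narrow interface.
    static int sumOfSquares(TaskExecutor exec, int a, int b) throws Exception {
        Callable<Integer> first = () -> a * a;
        Callable<Integer> second = () -> b * b;
        int sum = 0;
        for (Future<Integer> f : exec.invokeAll(Arrays.asList(first, second))) {
            sum += f.get();
        }
        return sum;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(sumOfSquares(new ExecutorServiceAdapter(pool), 3, 4));
        } finally {
            pool.shutdown(); // only the owner of the pool can do this
        }
    }
}
```

With this shape, the caller that owns the pool keeps control over its lifecycle, and a ThreadService could implement the same narrow interface.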

best regards,
Tobias


Regards,
Curtis


On Thu, Dec 5, 2013 at 4:48 PM, Curtis Rueden <[hidden email]> wrote:
Hi Tobias,

> Curtis, could you elaborate on why you prefer scijava-common
> ThreadService? I just had a look and, as an interface, I cannot see
> what would make it preferable to ExecutorService. For me a point in
> favour of ExecutorService is that it is in the JDK. What additional
> value would ThreadService provide?

For me, the fact that we control the ThreadService API contract is a point in its favor. If we need to add some functionality, we cannot add it to ExecutorService, because it is part of core Java. But the SciJava Common API can adapt to our needs.

> Would it be an option to let ThreadService extends ExecutorService?

As a rule of thumb, I prefer composition to inheritance [1]. The SciJava Common ThreadService *has* an ExecutorService internally. Right now, that is not exposed in the API contract, but we could do so (i.e., a "setExecutor" method). That would also solve Lee's single-threaded use case without requiring him to override the DefaultThreadService implementation itself.
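
To make the composition idea concrete, here is a rough sketch. ComposedThreadService and its run() and dispose() methods are hypothetical and do not reflect the actual scijava-common API; setExecutor is the method name floated above. Shutting down the previous executor on replacement is a choice made for this sketch only.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical service that *has* an ExecutorService instead of *being* one.
final class ComposedThreadService {
    private volatile ExecutorService executor = Executors.newCachedThreadPool();

    // Swap the backing executor, e.g. for a single-threaded one.
    void setExecutor(ExecutorService replacement) {
        ExecutorService previous = executor;
        executor = replacement;
        previous.shutdown(); // sketch choice: release the executor we no longer use
    }

    <T> Future<T> run(Callable<T> task) {
        return executor.submit(task);
    }

    void dispose() {
        executor.shutdown();
    }
}

class CompositionDemo {
    // Returns true if two tasks ran on the same thread after the swap.
    static boolean runsOnOneThread() throws Exception {
        ComposedThreadService service = new ComposedThreadService();
        service.setExecutor(Executors.newSingleThreadExecutor());
        try {
            Future<String> a = service.run(() -> Thread.currentThread().getName());
            Future<String> b = service.run(() -> Thread.currentThread().getName());
            return a.get().equals(b.get());
        } finally {
            service.dispose();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("same thread: " + runsOnOneThread());
    }
}
```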

> Correct me if I'm wrong, but I thought the same was the idea about
> ExecutorService…

Yes. ExecutorService is an interface, so if the ImgLib2 API contract uses that interface, it is true that callers can pass whatever kind of ExecutorService they want. And that is a form of dependency injection.

The SciJava Common approach uses SezPoz to discover plugins (including services) and organize them by priority. If a ThreadService is found on the classpath with higher priority than the DefaultThreadService, it will take precedence for any SciJava application context that needs a ThreadService.
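
The "highest priority wins" rule can be illustrated without SezPoz. In this sketch the Candidate class, the priority values and the name CustomThreadService are all made up; it only shows the selection rule, not how scijava-common actually discovers or ranks plugins.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a discovered service implementation.
final class Candidate {
    final String name;
    final double priority;

    Candidate(String name, double priority) {
        this.name = name;
        this.priority = priority;
    }
}

class PrioritySelectionDemo {
    // Picks the candidate with the highest priority value.
    static Candidate select(List<Candidate> found) {
        return Collections.max(found,
                Comparator.comparingDouble((Candidate c) -> c.priority));
    }

    public static void main(String[] args) {
        List<Candidate> found = Arrays.asList(
                new Candidate("DefaultThreadService", 0.0),   // assumed default priority
                new Candidate("CustomThreadService", 100.0)); // assumed override
        System.out.println(select(found).name);
    }
}
```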

Advantages of ExecutorService:
* Avoids dependency on SciJava Common, and potential overhead of SJC application context.
* Easy to override threading behavior on a case-by-case basis (i.e., pass different ExecutorServices to different algorithms).

Advantages of ThreadService:
* Can improve the API as needed.
* Encapsulates internal dependencies such as java.util.concurrent (e.g., Avian doesn't have that package!).
* Easy to globally override threading behavior for an entire application context.

There are surely other reasons to go one way or the other, but that's all that's coming out of my brain at the moment.

Regards,
Curtis




On Thu, Dec 5, 2013 at 4:27 PM, Tobias Pietzsch <[hidden email]> wrote:

On Dec 5, 2013, at 10:32 PM, Johannes Schindelin <[hidden email]> wrote:

> Hi Tobi,
>
> On Thu, 5 Dec 2013, Tobias Pietzsch wrote:
>
>> Curtis, could you elaborate on why you prefer scijava-common ThreadService?
>
> Very easy: scijava-common's ThreadService can be overridden by your own
> implementation. Such as a ThreadService backed by KNIME's own…

Correct me if I'm wrong, but I thought the same was the idea about ExecutorService…
best regards,
Tobias

>
> Remember, not everybody starts everything on a single-user desktop
> machine, in an Oracle JVM launched from within Eclipse with a Swing/AWT
> user interface ;-)
>
> Ciao,
> Dscho





_______________________________________________
ImageJ-devel mailing list
[hidden email]
http://imagej.net/mailman/listinfo/imagej-devel

Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Jean-Yves Tinevez
Hi all

OK, thank you very much, this is inspiring.

I see the discussion is still going on and that we (the team) did not reach a definitive conclusion yet.

I just want to add a question item to the discussion:
Sometimes you need to have information on the multithreading configuration that has been set upstream.

For instance, taking the example of TrackMate spot detection:

Each frame of a movie can be processed concurrently: I generate a task for each frame, and can feed this task to any interface we are discussing right now. For this, I do not need to know how many threads are allocated to the service: it will decide how to process the tasks I generated.

But there are many algorithms in imglib2 that can process a single image in a multithreaded way, by splitting it into small pieces. For instance, you can compute the Gauss convolution on 2 threads, and it will split the source image in 2. For this, the algorithm needs to have some info on the "multitasking resources" available, right? If you have 24 workers, and each worker receives one frame to segment, the segmenter needs to know it is unwise to split the source image over several extra workers. No?

How is this achieved in real world applications?
best
jy

Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Tobias Pietzsch
Hi JY,

Exactly. That's what I meant when I wrote:
> One thing that I'm missing with the ExecutorService is getting an estimate of how many worker threads there are. This is useful to guide a heuristic of how many tasks a problem should be split into.
> We now just use Runtime.getRuntime().availableProcessors() for that purpose, but it would be nice to have something better.
Maybe I wasn't being clear enough.

Usually (also in the Gauss) I split the task into a number of subtasks that is a multiple of the number of available workers.
The reasoning behind this is that the workers might finish a task at different speeds. This can be caused by other stuff (even outside the JVM) using resources. Or it can be caused simply by the tasks being more or less demanding. For instance, in spot detection, a frame that is completely black will probably finish a bit earlier than a normal frame.
So if there are N workers and N tasks, N-1 workers end up waiting for the slowest task to finish. When the work is split into more tasks, a worker that finishes just picks the next unprocessed task.
With very many tasks, I guess there is a point where the task management overhead slows things down, so a reasonable heuristic is required (to be defined :-)
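
The splitting heuristic described above could be sketched like this. The helper name and the oversubscription factor of 4 are assumptions for illustration; availableProcessors() is the stand-in for the worker count mentioned earlier.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkingDemo {
    // Splits the index range [0, size) into contiguous [start, end) chunks.
    // The number of chunks is numWorkers * tasksPerWorker, capped by size.
    static List<long[]> split(long size, int numWorkers, int tasksPerWorker) {
        long numTasks = Math.max(1, Math.min(size, (long) numWorkers * tasksPerWorker));
        List<long[]> chunks = new ArrayList<>();
        for (long i = 0; i < numTasks; i++) {
            long start = i * size / numTasks;
            long end = (i + 1) * size / numTasks;
            chunks.add(new long[] { start, end });
        }
        return chunks;
    }

    public static void main(String[] args) {
        int workers = Runtime.getRuntime().availableProcessors();
        // Assumed factor: 4 tasks per worker, so fast workers can pick up more work.
        List<long[]> chunks = split(1_000_000, workers, 4);
        System.out.println(chunks.size() + " chunks for " + workers + " workers");
    }
}
```

A larger factor evens out uneven task durations at the price of more scheduling overhead; the right value would have to be found by benchmarking.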

best regards,
Tobias

Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Lee Kamentsky
In reply to this post by Jean-Yves Tinevez
Hi all,

On Fri, Dec 6, 2013 at 3:29 AM, Jean-Yves Tinevez <[hidden email]> wrote:
Hi all

Each frame of a movie can be processed concurrently: I generate a task for each frame, and can feed this task to any interface we are discussing right now. For this, I do not need to know how many threads are allocated to the service: it will decide how to process the tasks I generated.

But there are many algorithms in imglib2 that can process a single image in a multithreaded way, by splitting it into small pieces. For instance, you can compute the Gauss convolution on 2 threads, and it will split the source image in 2. For this, the algorithm needs to have some info on the "multitasking resources" available, right? If you have 24 workers, and each worker receives one frame to segment, the segmenter needs to know it is unwise to split the source image over several extra workers. No?

How is this achieved in real world applications?
I think we're in a lucky sweet spot where the size of our data is big and the number of processors is small. I think you want to amortize the cost of enqueueing the work, thread signaling and thread context switching over a pretty big chunk of data - and that's operating-system and processor-specific in my experience, so it's best to benchmark. If I remember correctly, as an example, Ilastik 0.5 breaks its images into chunks of 10K pixels and runs perhaps 100 filters on each, plus a random-forest evaluation. In that case, the large-grain operation makes the optimization question pretty moot since the computation is so expensive (and doing the chunking at the top level might be a good design choice).

It would be kind of cool if the service could give you an idea of the cost of running one future and of the size of the thread pool. This is notoriously difficult to measure, but a rough cut might be to find the minimum time from enqueuing a future until its execution and multiply that by 2. If you knew the cost of running your algorithm on N pixels, you could figure out how to slice it. It would also be kind of cool if your imglib data container (or some helper class that encapsulated this expertise) could give you a collection of iterators appropriate for the cost of your operation and the thread service it would be run on.
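
A rough sketch of the measurement suggested above: submit trivial tasks, record the time from enqueuing until the task starts running, take the minimum, and double it. The class and method names are hypothetical, and the factor of 2 is just the rule of thumb from the text; real numbers depend heavily on the OS, JVM warm-up and machine load.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class DispatchCostDemo {
    // Estimates the per-task dispatch cost in nanoseconds as
    // 2 * (minimum observed delay between submit and start of execution).
    static long estimateDispatchNanos(ExecutorService pool, int samples) throws Exception {
        if (samples <= 0) {
            throw new IllegalArgumentException("samples must be positive");
        }
        long min = Long.MAX_VALUE;
        for (int i = 0; i < samples; i++) {
            final long enqueued = System.nanoTime();
            // The task itself reports how long it waited before it started.
            Future<Long> delay = pool.submit(() -> System.nanoTime() - enqueued);
            min = Math.min(min, delay.get());
        }
        return 2 * min;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println("estimated dispatch cost: "
                    + estimateDispatchNanos(pool, 1000) + " ns");
        } finally {
            pool.shutdown();
        }
    }
}
```

Comparing this estimate with the measured cost of the algorithm on N pixels would give a lower bound on a sensible chunk size.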
best
jy

--
--
Please avoid top-posting, and please make sure to reply-to-all!

Mailing list web interface: http://groups.google.com/group/fiji-devel

---
You received this message because you are subscribed to the Google Groups "Fiji-devel" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/groups/opt_out.



Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Tobias Pietzsch
Hi,

Problem: Did anyone think about the possibility of deadlocks?

Let's take the scenario that Lee mentioned. We enqueue a lot of tasks, each of which triggers computation of a set of features for one image.
Probably the features can be computed in parallel, so basically what a single task does is: create subtasks, submit to ExecutorService, then block until they complete.
The subtask for a single feature of a single image might again create subtasks for parts of the image and block until they complete.
This may easily lead to the situation that all threads in the ExecutorService block because they wait for subtasks that are not executed because all threads block… Right?

So unfortunately, in my opinion that rules out both ExecutorService and SJC ThreadService (unless heavily modified).
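
The starvation scenario can be reproduced in a few lines. This sketch uses a fixed pool and a timeout so that the demo reports the problem instead of hanging; the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class NestedSubmitDemo {
    // Returns true if the inner task could not run while the outer task
    // was blocked waiting for it.
    static boolean innerTaskStarves(int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            Future<Boolean> outer = pool.submit(() -> {
                // The outer task occupies a worker thread and submits a subtask
                // to the same pool ...
                Future<Integer> inner = pool.submit(() -> 42);
                try {
                    // ... then blocks until the subtask completes.
                    inner.get(200, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException e) {
                    // No free worker was available to run the subtask.
                    return true;
                }
            });
            return outer.get();
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("pool of 1 starves: " + innerTaskStarves(1));
        System.out.println("pool of 2 starves: " + innerTaskStarves(2));
    }
}
```

With a single worker the inner task can never start, because the only thread is the one waiting for it. With more workers the same thing happens as soon as enough outer tasks are in flight to occupy all of them.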


How to solve this? The perfect solution is Fork/Join from java.util.concurrent: http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html
Unfortunately, this was introduced only in Java 7, and that's a problem because we want to be 1.6 compatible.

So what should we do?
There seems to be the option of using the Fork/Join framework with JDK 1.6 (see http://homes.cs.washington.edu/~djg/teachingMaterials/grossmanSPAC_forkJoinFramework.html#installation). If I understand correctly, this is the exact implementation used in JDK 1.7.
The other option is to reinvent that particular wheel in scijava-common or imglib2.
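
For comparison, a minimal Fork/Join sketch (Java 7 or later). The task class and the threshold are made up for illustration. The point is that join() lets a waiting worker run pending subtasks instead of merely blocking, so nested splitting completes even with a parallelism of 1.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

// Hypothetical task: sums a slice of an array by recursive splitting.
final class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 1000; // assumed cut-off for sequential work
    private final int[] data;
    private final int from;
    private final int to;

    SumTask(int[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                      // queue the left half for any worker
        long rightSum = right.compute();  // work on the right half ourselves
        return left.join() + rightSum;    // join() may run 'left' in this thread
    }
}

class ForkJoinDemo {
    static long sum(int[] data, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new SumTask(data, 0, data.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] data = new int[10_000];
        for (int i = 0; i < data.length; i++) {
            data[i] = i;
        }
        System.out.println(sum(data, 1));
    }
}
```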

best regards,
Tobias



Re: [fiji-devel] Imglib2: using threadpools in core algorithms

dscho
In reply to this post by Tobias Pietzsch
Hi Tobi,

On Fri, 6 Dec 2013, Tobias Pietzsch wrote:

> On Dec 5, 2013, at 11:56 PM, Curtis Rueden <[hidden email]> wrote:
>
> > > * Encapsulates internal dependencies such as java.util.concurrent
> > > (e.g., Avian doesn't have that package!).
> >
> > And Android / J2ME doesn't have java.util.concurrent either. I am
> > interested in porting ImageJ2 to Android (which would necessitate a
> > port of ImgLib2 to Android). The more "weird" Java packages we use,
> > the more involved it will be to refactor those usages out later to
> > make such a thing possible. The "swappability" of SJC services will
> > obviate the need to do it in some cases.
>
> On a closer look, ThreadService uses java.util.concurrent.Callable and
> import java.util.concurrent.Future.
Yes, they do. At least Callable is so simple that it does not make sense
whatsoever to reimplement it.

> It also extends java.util.concurrent.ThreadFactory. So the Android/AVIAN
> argument is not really valid.

Whoa, slow! The *more* you take from the concurrent package, the *harder*
it gets to support Android/Avian. It's not "take it or leave it". So
Curtis' argument is a very valid one! Just because you use two interfaces
that are easily provided in a support library does not mean that you could
just as easily buy into the complete concurrent package, because that one
is *not* easily provided in a support library!

> To reiterate my previous question why not extend the ExecutorService
> interface instead?
>
> One answer I could give to that myself is that we do not need/want the
> full ExecutorService, e.g., an imglib algorithm should not be allowed to
> shutdown() the ExecutorService.

There you go.

> On the other hand, we do not need/want the full ThreadService, e.g.
> using newThread() from the ThreadFactory is not wanted.

In ImgLib2 core, no, you do not need that right now. But scijava-common is
about striking a balance between what ImgLib2 needs and what other
scientific Java projects need.

Personally, I have a strong faith in ImgLib2 being able to cope with the
ThreadService providing the newThread() method and just not use it.

I also have a strong faith that both Lee with his SingleThreadService and
myself with whatever I have to implement to support the ThreadService in
Avian (if ever needed, really) will have a much smaller problem
implementing ThreadService than a full-fledged ExecutorService.

> So could we maybe extract the part of ExecutorService that we are
> interested in (invokeAll, invokeAny, and submit variants) into a new
> interface and make ThreadService extend that?

The idea of ThreadService (it being an interface already) was to *be* that
new interface.

> This would still not allow us to pass a standard ExecutorService into
> imglib algorithms but a ThreadService. I could live with it.

And again, scijava-common already has a default implementation using the
ExecutorService (which we will have to override in the case of Android or
Avian).

> The question then is whether this should live in imglib, in
> scijava-common, or somewhere else. I hesitate making scijava-common a
> dependency of imglib.

Yes, you made that point earlier.

Personally, I am less and less convinced that it is a good idea to go out
of our way to make ImgLib2-core a kitchen sink. It is about
multi-dimensional data processing, after all, not about common
functionality to scientific Java projects.

The name scijava-common was chosen to reflect the desire to put code used
commonly in scientific Java projects into that library rather than
reinvent the wheel in every scientific Java project.

And it is understandable that the same scientists who would not use
Bio-Formats for reading and writing AVIs (because the name suggests that
Bio-Formats is intended for biologists, and they are physicists, after
all) would refuse to link to ImgLib2-core for the useful stuff unrelated
to image processing because they do not want to process images.

Again, that is why scijava-common was started, and we put quite useful
functionality into it that all arose from the Fiji/ImageJ2 projects but is
of wider interest than just image processing, at the same time managing to
keep the library small and sweet.

> I know just having scijava-common as a dependency does not force me to
> use the application container but still… Making imglib a dependency of
> scijava-common is not a good idea either.

Exactly. I think it is a wise idea to trust Curtis on the separation of
concerns because he has spent literally the past four years on that
subject.

In particular, your intuition that imglib2-core should not be a dependency
of scijava-common makes absolute sense from the point of view that modules
need to be separated by virtue of their particular focus: the
scijava-common library's focus being quite obviously *the* base for
scientific Java programming, right?

Ciao,
Dscho

Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Lee Kamentsky
In reply to this post by Tobias Pietzsch
Hi,
Nice catch, Tobias
On Fri, Dec 6, 2013 at 10:10 AM, Tobias Pietzsch <[hidden email]> wrote:
Hi,

Problem: Did anyone think about the possibility of deadlocks?

Let's take the scenario that Lee mentioned. We enqueue a lot of tasks, each of which triggers computation of a set of features for one image.
Probably the features can be computed in parallel, so basically what a single task does is: create subtasks, submit to ExecutorService, then block until they complete.
The subtask for a single feature of a single image might again create subtasks for parts of the image and block until they complete.
This may easily lead to the situation that all threads in the ExecutorService block because they wait for subtasks that are not executed because all threads block… Right?

 They block AND are idle for no good reason - doubly bad.

So unfortunately, in my opinion that rules out both ExecutorService and SJC ThreadService (unless heavily modified).


How to solve this? The perfect solution is Fork/Join from java.util.concurrent: http://docs.oracle.com/javase/tutorial/essential/concurrency/forkjoin.html
Unfortunately, this was introduced only in Java 7, and that's a problem because we want to be 1.6 compatible.

There have been recent attempts to solve this in Python (e.g. http://greenlet.readthedocs.org/en/latest/). If you have a language with lightweight pseudo-stacks and pseudo-threads, it's easy to just swap "stacks" when you hit one of the blocking states and trade for a task that's unblocked. For Java and C, true stacks are big (1MB) and the swapping is not so good.

One semi-elegant solution is for the service to check whether the thread requesting redemption of a future is a worker thread. If so, the function that blocks just takes some work off the work queue, preferring work queued by its own thread, and completes it. An idle thread is also allowed to take work off this queue. So you need to attach thread identity to the objects being enqueued. There's no downside to making a worker thread complete the items it queued itself: the work has probably been queued up optimally at a higher level.
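
A simplified sketch of this "help while you wait" idea. HelpingPool and its methods are hypothetical. Unlike the description above, it does not track thread identity: any caller of join() simply runs whatever is next in the shared queue until its own future is done.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class HelpingPool {
    private final LinkedBlockingQueue<FutureTask<?>> queue = new LinkedBlockingQueue<>();
    private volatile boolean running = true;

    HelpingPool(int numThreads) {
        for (int i = 0; i < numThreads; i++) {
            Thread worker = new Thread(() -> {
                while (running) {
                    try {
                        FutureTask<?> task = queue.poll(50, TimeUnit.MILLISECONDS);
                        if (task != null) {
                            task.run();
                        }
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            });
            worker.setDaemon(true);
            worker.start();
        }
    }

    <T> Future<T> submit(Callable<T> task) {
        FutureTask<T> future = new FutureTask<>(task);
        queue.add(future);
        return future;
    }

    // Instead of blocking, run queued work until 'future' is done.
    <T> T join(Future<T> future) throws InterruptedException, ExecutionException {
        while (!future.isDone()) {
            FutureTask<?> other = queue.poll();
            if (other != null) {
                other.run();
            } else {
                try {
                    // Nothing to help with: another thread is running our task.
                    return future.get(1, TimeUnit.MILLISECONDS);
                } catch (TimeoutException stillRunning) {
                    // Check the queue again.
                }
            }
        }
        return future.get();
    }

    void shutdown() {
        running = false;
    }
}

class HelpingPoolDemo {
    // Nested submission that would starve a plain single-thread pool.
    static int nestedSum(int numThreads) throws Exception {
        HelpingPool pool = new HelpingPool(numThreads);
        try {
            Future<Integer> outer = pool.submit(() -> {
                List<Future<Integer>> parts = new ArrayList<>();
                for (int i = 1; i <= 4; i++) {
                    final int n = i;
                    parts.add(pool.submit(() -> n * n));
                }
                int sum = 0;
                for (Future<Integer> part : parts) {
                    sum += pool.join(part);
                }
                return sum;
            });
            return pool.join(outer);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(nestedSum(1));
    }
}
```

One trade-off of helping in place is that the call stack of a waiting thread grows with every task it picks up, so deeply nested work could need a depth limit.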


Re: [fiji-devel] Imglib2: using threadpools in core algorithms

dscho
Hi Lee,

On Fri, 6 Dec 2013, Lee Kamentsky wrote:
>
> On Fri, Dec 6, 2013 at 10:10 AM, Tobias Pietzsch <[hidden email]> wrote:
>
> > Problem: Did anyone think about the possibility of deadlocks?
>
> Nice catch, Tobias

Does anybody have a different reaction than "Oh well, that's right, we
cannot use the ExecutorService, then, but instead need to adapt the
ThreadService so it handles this one right"?

Ciao,
Dscho


Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Tobias Pietzsch
In reply to this post by dscho
Hi Johannes,

> And it is understandable that the same scientists who would not use
> Bio-Formats for reading and writing AVIs (because the name suggests that
> Bio-Formats is intended for biologists, and they are physicists, after
> all) would refuse to link to ImgLib2-core for the useful stuff unrelated
> to image processing because they do not want to process images.
>
> Again, that is why scijava-common was started, and we put quite useful
> functionality into it that all arose from the Fiji/ImageJ2 projects but is
> of wider interest than just image processing, at the same time managing to
> keep the library small and sweet.
As far as I see it, a big part of scijava-common is to be a dependency injection framework.
It can provide an application context, it can harvest annotations from my classes, it can even modify my eclipse projects.
This is exactly the right thing to use if you are building an application.
In my opinion, it is absolutely not the right thing if you are building a library.

best regards,
Tobias

> to image processing because they do not want to process images.
>
> Again, that is why scijava-common was started, and we put quite useful
> functionality into it that all arose from the Fiji/ImageJ2 projects but is
> of wider interest than just image processing, at the same time managing to
> keep the library small and sweet.
>
>> I know just having scijava-common as a dependency does not force me to
>> use the application container but still… Making imglib a dependency of
>> scijava-common is not a good idea either.
>
> Exactly. I think it is a wise idea to trust Curtis on the separation of
> concerns because he has spent literally the past four years on that
> subject.
>
> In particular, your intuition that imglib2-core should not be a dependency
> of scijava-common makes absolute sense from the point of view that modules
> need to be separated by virtue of their particular focus: the
> scijava-common library's focus being quite obviously *the* base for
> scientific Java programming, right?
>
> Ciao,
> Dscho

_______________________________________________
ImageJ-devel mailing list
[hidden email]
http://imagej.net/mailman/listinfo/imagej-devel


Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Tobias Pietzsch
In reply to this post by dscho
Hi,

On Dec 6, 2013, at 7:28 PM, Johannes Schindelin <[hidden email]> wrote:

> Hi Lee,
>
> On Fri, 6 Dec 2013, Lee Kamentsky wrote:
>>
>> On Fri, Dec 6, 2013 at 10:10 AM, Tobias Pietzsch <[hidden email]>wrote:
>>
>>> Problem: Did anyone think about the possibility of deadlocks?
>>
>> Nice catch, Tobias
>
> Does anybody have a different reaction than "Oh well, that's right, we
> cannot use the ExecutorService, then, but instead need to adapt the
> ThreadService so it handles this one right"?

Here, I have a different reaction:
As I pointed out, this has been done exactly right in Java 7 Fork/Join framework (by people who presumably put more thought and experience into it than we could).
If the majority insists that we reimplement something similar, then at least let us use the same interfaces.
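
To make the deadlock concern concrete, here is a minimal sketch (not code from ImgLib2 or scijava-common; the class and method names `NestedTaskDemo`, `nestedSubmitStalls`, and `forkJoinSum` are made up for illustration). It assumes the problematic pattern is a task that blocks on a subtask submitted to the same bounded pool, and contrasts it with a Fork/Join task whose `join()` lets the worker run the forked subtask itself.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.Future;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class NestedTaskDemo {

    /** Returns true if a task waiting on its own subtask fails to finish in time. */
    static boolean nestedSubmitStalls(final ExecutorService pool) {
        Future<Integer> outer = pool.submit(new Callable<Integer>() {
            @Override
            public Integer call() throws Exception {
                Future<Integer> inner = pool.submit(new Callable<Integer>() {
                    @Override
                    public Integer call() {
                        return 42;
                    }
                });
                // Blocks a worker while the subtask may still be sitting in the queue.
                return inner.get();
            }
        });
        try {
            outer.get(500, TimeUnit.MILLISECONDS);
            return false;
        } catch (TimeoutException e) {
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow(); // interrupts a stuck worker so the JVM can exit
        }
    }

    /** Sums the integers in [from, to) by recursive splitting. */
    static final class SumTask extends RecursiveTask<Long> {
        private final int from;
        private final int to;

        SumTask(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        protected Long compute() {
            if (to - from <= 4) {
                long sum = 0;
                for (int i = from; i < to; i++) sum += i;
                return sum;
            }
            int mid = (from + to) >>> 1;
            SumTask left = new SumTask(from, mid);
            left.fork();                              // queue the left half
            long right = new SumTask(mid, to).compute();
            return right + left.join();               // the worker may run 'left' itself
        }
    }

    /** Nested fork/join on a pool with a single worker thread. */
    static long forkJoinSum(int n) {
        ForkJoinPool pool = new ForkJoinPool(1);
        try {
            return pool.invoke(new SumTask(0, n));
        } finally {
            pool.shutdown();
        }
    }
}
```

With `Executors.newSingleThreadExecutor()` the first method reports a stall, because the only worker is blocked in `inner.get()`. The Fork/Join variant still completes with one worker, since the forked half sits on that worker's own queue and is executed during `join()`.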

best regards,
Tobias


Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Curtis Rueden
Hi all,

After sleeping on it, I do not think SJC ThreadService is actually the way to go for providing better threading configuration of algorithms. But unlike Tobias, I do *not* think this is because SJC contexts are burdensome. Specifically, in response to Tobias's comment:

> [SJC] is absolutely not the right thing if you are building a library in my opinion.

I strongly disagree; SJC was created to benefit SCIFIO, and it has done so tremendously. I can elaborate on this if you like, but suffice to say that even libraries have configuration and state and services that they need to provide, and a unified context is an excellent way to do this. If you instead rely on statics, you will create something that cannot be extended. We learned this from painful experience with Bio-Formats!

Anyway, the reason I think SJC ThreadService is wrong for this use case is due to another bullet point I mentioned earlier:

> Easy to override threading behavior on a case-by-case basis
> (i.e., pass different ExecutorServices to different algorithms).

That is, SJC services are singletons, intended to store state for the application context. The threading configuration for running an algorithm might be different than that for another algorithm running in the same context. We need to be able to configure it individually. While ThreadService might be able to serve as a sensible default configuration (i.e., "just use SJC's ExecutorService by default"), it should not be forced on every algorithm.
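
As a rough sketch of what "configure it individually, with a sensible default" could look like (the names `ChunkedSum` and `DEFAULT_POOL` are hypothetical, and the shared pool merely stands in for whatever a context would provide; none of this is existing ImgLib2 or SJC API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;

final class ChunkedSum {

    // Hypothetical shared default; daemon threads so it never blocks JVM exit.
    private static final ExecutorService DEFAULT_POOL = Executors.newFixedThreadPool(
        Runtime.getRuntime().availableProcessors(),
        new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, "chunked-sum-default");
                t.setDaemon(true);
                return t;
            }
        });

    /** Convenience overload: uses the shared default executor. */
    static long sum(long[] data) {
        return sum(data, DEFAULT_POOL, 4);
    }

    /** Per-call override: the caller decides which executor runs this algorithm (numTasks >= 1). */
    static long sum(final long[] data, ExecutorService executor, int numTasks) {
        List<Callable<Long>> tasks = new ArrayList<Callable<Long>>();
        final int chunk = Math.max(1, (data.length + numTasks - 1) / numTasks);
        for (int start = 0; start < data.length; start += chunk) {
            final int from = start;
            final int to = Math.min(data.length, start + chunk);
            tasks.add(new Callable<Long>() {
                @Override
                public Long call() {
                    long s = 0;
                    for (int i = from; i < to; i++) s += data[i];
                    return s;
                }
            });
        }
        try {
            long total = 0;
            // invokeAll returns once every chunk has completed.
            for (Future<Long> f : executor.invokeAll(tasks)) total += f.get();
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

The algorithm never shuts the executor down; ownership stays with whoever passed it in.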

> As I pointed out, this has been done exactly right in Java 7 Fork/Join
> framework (by people who presumably put more thought and experience
> into it than we could). If the majority insists that we reimplement
> something similar, then at least let us use the same interfaces.

I agree that it would be great if we could use something that has been vetted by the greater Java community. However, this makes porting to non-JavaSE-7 platforms much more difficult. Tobias, I know you pointed out some java.util.concurrent leakages into ThreadService -- we weren't being overly careful about it yet -- but my point stands that SJC has the potential to make those sorts of ports easier if we put effort into it. More Jenkins builds against alternative Java implementations would be a great start toward that.

Anyway, it's not that I *want* to reimplement something similar to Java 7's Fork/Join... it's that I don't have a good alternative if we also want to support those other scenarios. One possible compromise would be to design our own agnostic interface and provide an implementation in its own module that uses Java 7's Fork/Join. That way, it would be possible to provide an alternative implementation on platforms that don't support it. This sort of interface-driven extensible design is *exactly* what SJC seeks to make possible with its plugin framework.
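
A minimal sketch of that compromise, under the assumption that the agnostic contract exposes only `java.lang` and `java.util` types. The names `TaskRunner`, `SequentialTaskRunner`, and `ForkJoinTaskRunner` are invented for illustration and do not exist in SJC:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;

/** Hypothetical platform-agnostic contract: no java.util.concurrent types leak out. */
interface TaskRunner {
    /** Runs all tasks and returns once every one of them has completed. */
    void runAll(List<? extends Runnable> tasks);
}

/** Fallback for platforms without java.util.concurrent: runs tasks in the caller's thread. */
final class SequentialTaskRunner implements TaskRunner {
    @Override
    public void runAll(List<? extends Runnable> tasks) {
        for (Runnable task : tasks) task.run();
    }
}

/** Would live in its own module, since it needs Java SE 7's ForkJoinPool. */
final class ForkJoinTaskRunner implements TaskRunner {
    private final ForkJoinPool pool;

    ForkJoinTaskRunner(int parallelism) {
        this.pool = new ForkJoinPool(parallelism);
    }

    @Override
    public void runAll(List<? extends Runnable> tasks) {
        List<ForkJoinTask<?>> submitted = new ArrayList<ForkJoinTask<?>>();
        for (Runnable task : tasks) submitted.add(pool.submit(task));
        // join() waits for completion and rethrows task failures as unchecked exceptions.
        for (ForkJoinTask<?> t : submitted) t.join();
    }

    void close() {
        pool.shutdown();
    }
}
```

Algorithms would depend only on `TaskRunner`; which implementation they receive is decided by the caller or by the platform.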

Regards,
Curtis



Re: [fiji-devel] Imglib2: using threadpools in core algorithms

dscho
Administrator
In reply to this post by Tobias Pietzsch
Hi Tobi,

On Fri, 6 Dec 2013, Tobias Pietzsch wrote:

> As far as I see it, a big part of scijava-common is to be a dependency
> injection framework.

That is just part of it, but yes, it is a part.

> It can provide an application context, it can harvest annotations from
> my classes, it can even modify my eclipse projects.

It can add a file to your Eclipse project that works around an incorrect
interpretation of the Java specification, yes: annotation processing is
only performed correctly by Eclipse if you add that file.

> This is exactly the right thing to use if you are building an
> application.

And if you are building extensible frameworks with extension points, such
as: segmentation and tracking plugins for TrackMate, feature plugins for
the Trainable Segmentation, generic extensions for TrakEM2, commands for
the 3D Viewer, and yes, also operations for scijava-ops.

Extensibility is something that a lot of projects related to Fiji had to
reinvent due to lack of a base library providing a common extensibility
framework.

With scijava-common, there is no more need to do so.

I have to admit that it gets hard for me to accept that we are arguing
about reusability here: reusable classes in a library that weighs in at
a whopping two hundred kilobytes.

If you insist on not having such a dependency in imglib2-core, fine, I
cannot change your mind, I can only point out that imglib2-core will have
to reinvent at least a large part of scijava-common's functionality, one
by one, and as ImgLib2 will be used for the most part in projects that
already rely at least transitively on scijava-common (including your own
big data viewer), that will be a true duplication.

I rest my case because there is nothing else I can do, really.
Dscho


Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Lee Kamentsky
In reply to this post by Tobias Pietzsch
Hi Tobias,

On Fri, Dec 6, 2013 at 1:39 PM, Tobias Pietzsch <[hidden email]> wrote:

> Here, I have a different reaction:
> As I pointed out, this has been done exactly right in Java 7 Fork/Join framework (by people who presumably put more thought and experience into it than we could).
> If the majority insists that we reimplement something similar, then at least let us use the same interfaces.

It looks to me like a computation has to know whether or not it is being run inside a ForkJoinWorkerThread. If so, it calls ForkJoinTask.invoke() to complete the computation, and if not, it calls invoke() on the pool. I don't think this completely solves our problem: if X calls Y which invokes Z on the thread, then Z should invoke on the pool, but if X invokes Y which invokes Z, then Z should invoke on the ForkJoinTask. The mechanism should make its decision based on the thread identity and not burden the framework with keeping track of where it is. So I say that they didn't get it right and that we can do better.
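
A decision keyed on thread identity might be sketched like this. `ForkJoinTask.inForkJoinPool()` is a real static method in Java 7 that reports whether the calling thread is a ForkJoinPool worker; the helper `Dispatch.run` and the `Fib` task are hypothetical names used only to illustrate the idea. Note the assumption that it is enough to know the thread belongs to *some* pool, not which one.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.ForkJoinTask;
import java.util.concurrent.RecursiveTask;

final class Dispatch {
    /** Runs the task in place when already on a worker thread, otherwise hands it to the pool. */
    static <T> T run(ForkJoinPool pool, ForkJoinTask<T> task) {
        if (ForkJoinTask.inForkJoinPool()) {
            return task.invoke();   // already inside a pool: execute as a nested task
        }
        return pool.invoke(task);   // outside any pool: enter through the pool
    }
}

/** Toy computation whose nested calls all go through the same helper. */
final class Fib extends RecursiveTask<Integer> {
    private final ForkJoinPool pool;
    private final int n;

    Fib(ForkJoinPool pool, int n) {
        this.pool = pool;
        this.n = n;
    }

    @Override
    protected Integer compute() {
        if (n < 2) return n;
        Fib a = new Fib(pool, n - 1);
        a.fork();
        // The callee does not need to be told whether it is nested; the helper checks the thread.
        int b = Dispatch.run(pool, new Fib(pool, n - 2));
        return a.join() + b;
    }
}
```

The caller writes `Dispatch.run(pool, new Fib(pool, 10))` regardless of whether it is itself running inside the pool.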

Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Stephan Saalfeld
In reply to this post by Curtis Rueden
Hi all,

What about having a Java 7 Fork/Join-inspired ThreadService as an
independent component that SJC depends on?  As SJC grows,
cutting it into pieces that can optionally be grouped by parent POMs is
probably due sooner or later anyway.  Maybe this is the birthday of
SJCC (SJC-concurrency).

Tobias, thanks for bringing up the deadlock issue!

Best,
Stephan




Re: [fiji-devel] Imglib2: using threadpools in core algorithms

Tobias Pietzsch
In reply to this post by Curtis Rueden
Hi Curtis,

>> [SJC] is absolutely not the right thing if you are building a library in my opinion.
>
> I strongly disagree; SJC was created to benefit SCIFIO, and it has done so tremendously. I can elaborate on this if you like, but suffice to say that even libraries have configuration and state and services that they need to provide, and a unified context is an excellent way to do this. If you instead rely on statics, you will create something that cannot be extended. We learned this from painful experience with Bio-Formats!

Yes, could you please elaborate on this!
Maybe I should say that I have just recently (two days ago to be honest) started to seriously get into scijava-common and I like it very much so far. But I still don't see where it would benefit imglib2 core. I would like to have imglib2 be as close to purely functional as possible, no state, no side effects. As much as possible, at least.
Could you please explain where SCIFIO benefits from being stateful and where being stateful would probably help imglib?

best regards,
Tobias
