
Proposing to add extensive benchmarks to JuliaImages.jl #936

Closed
krishna-vasudev opened this issue Jan 21, 2021 · 17 comments

@krishna-vasudev
Contributor

Hey all. I am Debraj Bhal, an IT undergraduate at the International Institute of Information Technology, Bhubaneshwar.
As I was going through the project ideas page https://julialang.org/jsoc/gsoc/images/, I came across this:

JuliaImages provides high-quality implementations of many algorithms; however, as yet there is no set of benchmarks that compare our code against that of other image-processing frameworks. Developing such benchmarks would allow us to advertise our strengths and/or identify opportunities for further improvement.

I would like to contribute to this, but I am somewhat confused about where to start, so I hope @timholy and @zygmuntszpak will guide me on this.

@timholy
Member

timholy commented Jan 21, 2021

That would be fantastic! A good way to start would be to perform the same task in, e.g., JuliaImages, scikit-image, and OpenCV and compare the performance. Of course there will be differences in implementation that affect more than just speed, but that's the general idea. As for which algorithms to check, that's pretty open-ended. https://juliaimages.org/latest/api_comparison/ might be useful.
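
For concreteness, a minimal sketch of what one such comparison could look like, assuming BenchmarkTools.jl, PyCall, ImageFiltering, and scikit-image are installed (the choice of a Gaussian blur with σ = 3 is purely illustrative, not a prescribed benchmark):

```julia
using BenchmarkTools, PyCall, ImageFiltering

# A plain Float64 matrix keeps the comparison simple: both imfilter and
# scikit-image's gaussian accept it directly.
img = rand(Float64, 512, 512)
skfilters = pyimport("skimage.filters")   # scikit-image, called via PyCall

julia_time  = @belapsed imfilter($img, Kernel.gaussian(3))
python_time = @belapsed $skfilters.gaussian($img, sigma=3)

println("imfilter: $(julia_time) s, skimage.filters.gaussian: $(python_time) s")
```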

@krishna-vasudev
Contributor Author

- So for scikit-image and OpenCV, do they need to be called through Python?
- While creating the benchmarks, do they need to run in Julia through the PyCall package or not?
- And can you please point to some existing package benchmarks, so I get a clearer understanding of the procedure and format?

@timholy
Member

timholy commented Jan 22, 2021

scikit-image: yes, through Python.

For OpenCV, there is https://docs.opencv.org/master/d8/da4/tutorial_julia.html, but it's a bit challenging to build right now. On the one hand, that would be a reason to call it through Python. On the other hand, that seems likely to change (the student who performed that work may apply to GSoC to make it work seamlessly with Julia's build system), so if you can build it locally then I think you could call OpenCV through Julia.
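
To make the "call it through Python for now" route concrete, a minimal sketch assuming PyCall and the opencv-python package are available (the blur parameters are only illustrative):

```julia
using BenchmarkTools, PyCall

cv2 = pyimport("cv2")            # requires the opencv-python package
img = rand(Float32, 512, 512)

# ksize = (0, 0) lets OpenCV derive the kernel size from sigma.
t = @belapsed $cv2.GaussianBlur($img, (0, 0), 3)
println("cv2.GaussianBlur: $(t) s")
```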

While I haven't looked at it closely myself, one package that does this kind of thing is https://github.com/SciML/SciMLBenchmarks.jl.

@krishna-vasudev
Contributor Author

For OpenCV, I was having a look at the tutorial. I will try to set it up locally, and I hope it will work well on Windows.
- And for Images.jl, will the benchmarks that get developed be a separate package?
- I had a look at https://github.com/SciML/SciMLBenchmarks.jl, so some basic knowledge of HTML is necessary for presenting those benchmarks on a webpage; I will pick that up as well.

And @timholy, to start with I am exploring these tutorials, so that I can then develop the benchmarks in an organized way. Any suggestion on what would be the proper path to move forward with this?

@timholy
Member

timholy commented Jan 22, 2021

It's also fine to do the webpage thing separately; the most important thing would be the benchmark framework.

As for a "proper path," if you have a more specific question I'll be happy to try to answer it, but there's not much more I can say in abstract terms.

@krishna-vasudev
Contributor Author

krishna-vasudev commented Jan 22, 2021

Thank you, sir, for your support. Basically, while developing benchmarks we need to keep in mind:

  • how the same algorithms, implemented by these different packages, perform differently;
  • keeping track of efficiency for different sizes of images (see the size-sweep sketch at the end of this comment);
  • which packages stay efficient as image size increases;
  • some insight into why algorithms behave differently across implementations in different languages, e.g. Julia and Python;
  • developing some notebooks for proper verification, if someone wants to reproduce the results;
  • showing the strengths of JuliaImages.

And I will get to know what more to add as I explore the benchmark frameworks of different packages.
@timholy, anything you would like to add to the list for now?
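
As a sketch of the "different image sizes" point above (assuming BenchmarkTools.jl and ImageFiltering; the sizes and the Gaussian kernel are just placeholders):

```julia
using BenchmarkTools, ImageFiltering

# Time the same filter across several image sizes to see how performance scales.
for n in (256, 512, 1024, 2048)
    img = rand(Float64, n, n)
    t = @belapsed imfilter($img, Kernel.gaussian(3))
    println("$(n)×$(n): $(round(t * 1000, digits = 2)) ms")
end
```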

@timholy
Member

timholy commented Jan 22, 2021

That's a good list. We've seen some cases in the past where scikit-image takes shortcuts that hurt the quality of the result, but of course that was some time ago and they may have fixed it by now.

We expect to have good performance in some cases and worse in others. I know, for instance, that LoopVectorization will speed up imfilter several-fold; I've just been waiting to set up some benchmarks so I can measure the impact.

@krishna-vasudev
Contributor Author

Just going through the packages for now and developing the idea for the project. I was looking at LoopVectorization.jl, and from what I understood it can be placed in front of for-loops to speed them up by vectorizing the steps. So do we need to add it inside the implementation of imfilter, or break its implementation into a per-pixel function and combine the results in the program?

@timholy
Member

timholy commented Jan 23, 2021

I wouldn't add LoopVectorization just yet; let's get the benchmarks in place, if possible, and then check the impact!

It will be a mix of just putting @avx in some places and in others doing a bit of rewriting. (Mostly, simplifying and letting @avx do the work.) But we also have to pay attention to its impact on latency, because the first time you run @avx it's much slower to compile. I'm in the middle right now of doing a pretty significant look at latency in some of the lower-level packages (color packages, array packages like OffsetArrays, PaddedViews, MosaicViews, and the foundational package ImageCore). I already did one round on ImageFiltering (JuliaImages/ImageFiltering.jl#201), but adding LoopVectorization will force another full analysis of this issue. It will likely cause significant regressions in latency, while of course speeding up the runtime. Because I'm quite sensitive to the impact of latency, I've even wondered if we should add a caching method to LoopVectorization.
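
For a sense of the "just putting @avx in some places" case, here is a purely illustrative stencil loop (not ImageFiltering's actual code) with LoopVectorization's macro applied:

```julia
using LoopVectorization

# Illustrative only: a small cross-shaped stencil written as a plain nested
# loop, with @avx asked to vectorize it.
function crosssum!(out, img)
    @avx for j in 2:size(img, 2) - 1, i in 2:size(img, 1) - 1
        out[i, j] = img[i-1, j] + img[i+1, j] + img[i, j] +
                    img[i, j-1] + img[i, j+1]
    end
    return out
end

img = rand(Float64, 512, 512)
out = similar(img)
crosssum!(out, img)
```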

In other words, switching to LoopVectorization is a much more subtle project than it seems and definitely not something I'd recommend if you're still somewhat new to Julia. (If you're not, and playing with SnoopCompile.jl doesn't intimidate you, then don't hold back!)

@krishna-vasudev
Contributor Author

Although I have explored Julia, I am still new, so I will first try to put the benchmarks in place. While I was going through the ImageFiltering package, I saw there are some benchmarks, but they were in Julia only, and I didn't completely understand their significance. Can you brief me a little?

@timholy
Member

timholy commented Jan 23, 2021

Yes, they are to measure the speed of the Julia code (for example to compare older vs newer implementations). It would be great to test similar cases with other languages.

@krishna-vasudev
Contributor Author

krishna-vasudev commented Jan 23, 2021

Oh! I was going through some algorithm benchmarks; skimage has good efficiency in some, while JuliaImages does in others. I am now exploring all of them, and soon I will have the project developed as a whole in my mind.

What type of contribution can I do now, side by side, until I have the whole framework in mind? I am somewhat confused, so please guide me.
Sir, just curious: can't we add multilingual benchmark tables to our documentation? It would help us advertise our strengths better.

@krishna-vasudev
Contributor Author

We've seen some cases in the past where scikit-image takes shortcuts that hurt the quality of the result, but of course that was some time ago and they may have fixed it by now.

Can you say a little more about these shortcuts?

@timholy
Member

timholy commented Jan 25, 2021

I was misremembering, it was PIL (I'm not sure about scikit-image): #855

skimage has good efficiency in some, while JuliaImages does in others

That's very likely; much of Python's image processing is of course using C libraries, so performance should be fine for certain things. Once we have comparisons, we'll know where we need to invest in our own optimization.

What type of contribution can I do now, side by side, until I have the whole framework in mind? I am somewhat confused

This is the hardest thing to figure out. Because I haven't worked on this myself, I haven't yet put any thought into how to organize & present the benchmarks. Did you get any inspiration from studying the SciMLBenchmarks? I haven't looked in detail myself.

I can look into it and give advice, but in a certain sense that might rob you of some of the most interesting parts of this project. Often the most interesting thing is to come up with the design of a framework, not just "doing the work" (which can sometimes end up being repetitive). Consequently I encourage you to think about how you'd go about solving this problem yourself.

multilingual benchmark tables

That would be awesome! Again not something I've put any effort into, so I don't have a particular plan. Maybe discourse would be a good place to look for or start a conversation about translation frameworks?

@krishna-vasudev
Contributor Author

krishna-vasudev commented Jan 25, 2021

Did you get any inspiration from studying the SciMLBenchmarks?

Yes, I was going through it. It does things pretty similar to what I am thinking. While developing the framework, in some places we can follow it, and in others I have some different ideas.

Consequently I encourage you to think about how you'd go about solving this problem yourself.

Yes, I will try to figure out how to solve this; I am currently working on it.

And I was having some issues building OpenCV with Julia, so for the time being I am calling it through PyCall. But we can change that later if it works smoothly in Julia, i.e., if its developers make the required changes.

@krishna-vasudev
Contributor Author

@timholy, sir, is there any blog where I can get some knowledge about Julia automation tools that could be used in the benchmarking package for JuliaImages?

@timholy
Member

timholy commented Mar 12, 2021

There are https://github.com/JuliaCI/BenchmarkTools.jl and https://github.com/JuliaCI/PkgBenchmark.jl, if that's the kind of thing you mean. If you mean something else, maybe you can spell out more specifically what you mean by "Julia automations"?
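
Roughly how those two fit together, sketched under the assumption that PkgBenchmark's conventional `benchmark/benchmarks.jl` layout is used (the group names and the imfilter call are only placeholders):

```julia
# benchmark/benchmarks.jl — the file PkgBenchmark looks for. It must define a
# BenchmarkGroup named SUITE, which CI can then run and compare across commits.
using BenchmarkTools, ImageFiltering

const SUITE = BenchmarkGroup()
SUITE["filtering"] = BenchmarkGroup()

let img = rand(Float64, 512, 512)
    SUITE["filtering"]["gaussian_sigma3"] =
        @benchmarkable imfilter($img, Kernel.gaussian(3))
end
```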

@JuliaImages JuliaImages locked and limited conversation to collaborators Mar 28, 2021

This issue was moved to a discussion.

You can continue the conversation there.
