Re: [Statistics] Port codes from Commons Math


Hello all,

As I proposed earlier, I would like to begin porting code from Commons-math
<https://github.com/apache/commons-math> to Commons-statistics
<https://github.com/apache/commons-statistics>.
(For further details, refer to my GSoC proposal
<https://docs.google.com/document/d/1sqSa0hrYc2AD75RZyJRkeqCOBOqTOeMnPaBsE9U5YhU/edit?usp=sharing>,
though I was not selected this year.)

This is my proposed architecture in brief:

   1. Commons-Statistics-Core => Frequency and StatUtils classes (more
   common classes can be added during implementation)
   2. Commons-Statistics-Correlation
   3. Commons-Statistics-Descriptive
   4. Commons-Statistics-Inference
   5. Commons-Statistics-Interval
   6. Commons-Statistics-Ranking
   7. Commons-Statistics-Regression

While going through the ported Commons-Geometry code to get a head start, I
found that each module inside it contains a pom.xml file. Are the modules
implemented as separate projects and then grouped in the same package? I'm
asking because I'm new to code porting :-).

If so, should I create all 7 projects here and then group them in the same
project? I plan to start by porting the Ranking module, as it has fewer
dependencies than the others.

Would someone help me get a head start?

Best Regards,
Gimhana.


On 14 April 2018 at 14:24, Gimhana Nadeeshan <
gimhanadesilva.15@xxxxxxxxxxxxx> wrote:

> Hello devs,
>
> *Covariance stats =
>> > IntStream.of(1,2,3).collect(Covariance::new, Covariance::accept, Covariance::combine);*
>>
>>
>> Can you explain a bit more what is happening with the method references
>> "accept" and "combine"?
>>
>
> The mutable reduction operation collect() accumulates input elements
> into a mutable result container, such as a Collection. It requires three
> functions: a *supplier function* that constructs a new instance of the
> result container, an *accumulator function* that incorporates an input
> element into a result container, and a *combining function* that merges
> the contents of one result container into another.
>
> So the accept() method records a new value into the result container
> (here, the Covariance object); it feeds the values of the stream into the
> Covariance object. It is the method of the functional interface I'm
> going to implement in order to make use of Java 8 lambda expressions.
>
> The combine() method merges the state of another Covariance object
> into this one, i.e. it merges the results of one result container into
> another. Instead of generating a new object, the existing one is updated.
>
> As a whole, this implementation is analogous to building a single
> string by concatenating the strings in an array list. Each
> statistical functionality is provided as a state object in this
> implementation.
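
To make the three-function collect() pattern described above concrete, here
is a minimal sketch. The Covariance class itself is not shown in this thread,
so the example uses a hypothetical RunningMean container instead; its name and
its accept() and combine() methods are assumptions made for illustration, not
the proposed Commons-Statistics API.

```java
import java.util.stream.IntStream;

public class CollectSketch {

    /** Hypothetical mutable result container (not part of any Commons API). */
    static final class RunningMean {
        private long count;
        private double mean;

        /** Accumulator: folds one stream element into this container. */
        void accept(int value) {
            count++;
            mean += (value - mean) / count;
        }

        /** Combiner: merges another container's state into this one. */
        void combine(RunningMean other) {
            if (other.count == 0) {
                return; // nothing to merge
            }
            long total = count + other.count;
            mean += (other.mean - mean) * other.count / total;
            count = total;
        }

        double getMean() {
            return mean;
        }

        long getCount() {
            return count;
        }
    }

    public static void main(String[] args) {
        // Supplier, accumulator and combiner are passed as method references.
        RunningMean stats = IntStream.of(1, 2, 3)
                .collect(RunningMean::new, RunningMean::accept, RunningMean::combine);
        System.out.println(stats.getMean()); // prints 2.0
    }
}
```

In a sequential stream only the supplier and accumulator are exercised; the
combiner matters when the stream is parallel and partial containers have to
be merged.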
>
> *Week 2: Begin porting the code according to the dependency hierarchy
>> > identified. *
>> >
>>
>> Sorry but I cannot see where you identify the dependency hierarchy. Are
>> you
>> referring to your diagram?
>
>
> The dependency hierarchy is not mentioned separately in the proposal, but I
> have created the timeline of the proposed project according to it. Less
> dependent modules are ported first, gradually moving on to the
> more coupled ones. From that point of view, I am going to port the Ranking
> module first and then gradually port the Interval, Regression,
> Descriptive, Correlation, and Inference modules, and so on.
>
> A further comment: L1-type statistics such as median and quantiles can also
>> be included in the API by using the stream.sorted() method to sort the
>> stream first.
>>
>> While it is true medians can be in the aggregate sped up by partitioning
>> algorithms, I think making use of built-in methods like sorted() is still
>> likely to produce the best and most consistent performance with the JVM.
>
>
> Definitely. Using the built-in methods provided will improve the package's
> performance and ease of use, and using built-in methods wherever possible
> is one of the main goals of the proposed project.
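
As a rough illustration of the sorted() approach discussed above, the
following sketch computes a median from a DoubleStream. The class and method
names (MedianSketch, median) are hypothetical, and returning NaN for an empty
stream is an assumption made for this example only.

```java
import java.util.stream.DoubleStream;

public class MedianSketch {

    /**
     * Median via the built-in sorted() stream operation.
     * Returns NaN for an empty stream (an assumption of this sketch).
     */
    static double median(DoubleStream values) {
        double[] sorted = values.sorted().toArray();
        if (sorted.length == 0) {
            return Double.NaN;
        }
        int mid = sorted.length / 2;
        // Odd length: middle element; even length: mean of the two middle elements.
        return (sorted.length % 2 == 1)
                ? sorted[mid]
                : (sorted[mid - 1] + sorted[mid]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(DoubleStream.of(3, 1, 2)));    // prints 2.0
        System.out.println(median(DoubleStream.of(4, 1, 3, 2))); // prints 2.5
    }
}
```

This sorts the whole stream, so it costs O(n log n); a partitioning
(selection) algorithm could avoid the full sort, which is the trade-off
mentioned in the quoted comment.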
>
> Best Regards,
> Gimhana.
>
>
> Nadeeshan Gimhana
>
> Batch Representative (15' batch)
>
> Department of Computer Science & Engineering
>
> University of Moratuwa
>
> *Mobile :+94775744613*
>
>
> *Website : https://ngimhana94.wixsite.com/gimhanadesilva/
> <https://ngimhana94.wixsite.com/gimhanadesilva/>*
>
> *LinkedIn: www.linkedin.com/in/nadeeshangimhana/
> <http://www.linkedin.com/in/nadeeshangimhana/>*
>
>
>
> On 13 April 2018 at 12:26, Eric Barnhill <ericbarnhill@xxxxxxxxx> wrote:
>
>> A further comment: L1-type statistics such as median and quantiles can
>> also
>> be included in the API by using the stream.sorted() method to sort the
>> stream first.
>>
>> While it is true medians can be in the aggregate sped up by partitioning
>> algorithms, I think making use of built-in methods like sorted() is still
>> likely to produce the best and most consistent performance with the JVM.
>>
>> On Thu, Apr 12, 2018 at 2:03 PM, Eric Barnhill <ericbarnhill@xxxxxxxxx>
>> wrote:
>>
>> > Hi Gimhana,
>> >
>> > Sorry for the delay in response, but you posted this right before our
>> > two-week Easter holiday, for which I was completely absent ; then I
>> needed
>> > a few days back at work to clean up all the mess. :)
>> >
>> > Your overall goals look good to me. You have gone right to the heart of
>> > the matter and propose to reinvent the statistics tools to make good
>> use of
>> > the Java 8 API. I think that's great and you should get started. Your
>> goal
>> > of eliminating dependencies on Commons-Math is also right.
>> >
>> > I noticed this in the proposal:
>> >
>> > *Covariance stats =
>> >> IntStream.of(1,2,3).collect(Covariance::new, Covariance::accept, Covariance::combine);*
>> >
>> >
>> > Can you explain a bit more what is happening with the method references
>> > "accept" and "combine"?
>> >
>> > Also this
>> >
>> > *Week 2: Begin porting the code according to the dependency hierarchy
>> >> identified. *
>> >>
>> >
>> > Sorry but I cannot see where you identify the dependency hierarchy. Are
>> > you referring to your diagram?
>> >
>> > Eric
>> >
>> >
>> > On Mon, Mar 26, 2018 at 8:07 AM, Gimhana Nadeeshan <
>> > gimhanadesilva.15@xxxxxxxxxxxxx> wrote:
>> >
>> >> Hello devs,
>> >>
>> >> I have updated my draft proposal (Port codes from Commons Math
>> >> <https://docs.google.com/document/d/1sqSa0hrYc2AD75RZyJRkeqCOBOqTOeMnPaBsE9U5YhU/edit?usp=sharing>)
>> >> with a timeline, before submitting the final version on the Google
>> >> site. Feel free to comment and give feedback to improve it.
>> >>
>> >> Best Regards,
>> >> Gimhana.
>> >>
>> >> On 24 March 2018 at 17:35, Gimhana Nadeeshan <
>> >> gimhanadesilva.15@xxxxxxxxxxxxx> wrote:
>> >>
>> >> > Hello devs,
>> >> >
>> >> >
>> >> >> Note that some of the repositories included in that screen do
>> >> >> not belong to "Commons":
>> >> >>  * sling-*
>> >> >>  * webservices-*
>> >> >>  * xml-*
>> >> >
>> >> >
>> >> > I'm working on it. (Still researching Kibble :-) )
>> >> >
>> >> > Botched alignments...
>> >> >> "cloc" has several output formats from which you could produce
>> >> >> nicer tables.
>> >> >
>> >> >
>> >> > I'm extremely sorry. I'll fix it asap.
>> >> >
>> >> > Best Regards,
>> >> > Gimhana
>> >> >
>> >> > On 23 March 2018 at 17:43, Gilles <gilles@xxxxxxxxxxxxxxxxxxxxx>
>> wrote:
>> >> >
>> >> >> Hi Gimhana.
>> >> >>
>> >> >> On Thu, 22 Mar 2018 22:11:31 +0530, Gimhana Nadeeshan wrote:
>> >> >>
>> >> >>> Hello devs,
>> >> >>>
>> >> >>> After going through @Gilles's suggestions, I found some very
>> >> >>> interesting facts about Commons projects.
>> >> >>>
>> >> >>> Feel free to check the Kibble reports
>> >> >>>
>> >> >>> <https://demo.kibble.apache.org/dashboard.html?page=repos&subfilter=commons&author=true&from=1458585000&to=1521743399>
>> >> >>> regarding these projects. They give a clear picture of the
>> >> >>> progress of the projects. On the Commons side, there seems to be
>> >> >>> visible growth in contributors and releases.
>> >> >>>
>> >> >>
>> >> >> Note that some of the repositories included in that screen do
>> >> >> not belong to "Commons":
>> >> >>  * sling-*
>> >> >>  * webservices-*
>> >> >>  * xml-*
>> >> >>
>> >> >> There should be a way to filter them out.
>> >> >>
>> >> >>> I also created a simple doc using the data collected with the CLOC
>> >> >>> tool to get an idea of the Commons projects. I think this kind of
>> >> >>> document will help new volunteers get a rough idea of the scope and
>> >> >>> the current status of the projects before going deeper: Histogram
>> >> >>> of Commons Projects.
>> >> >>>
>> >> >>> <https://docs.google.com/document/d/1qPWWnA9hWgKytLWI3A3rXu47V8LSglgsV5hBxVnLiCI/edit?usp=sharing>
>> >> >>>
>> >> >>
>> >> >> Botched alignments...
>> >> >> "cloc" has several output formats from which you could produce
>> >> >> nicer tables.
>> >> >>
>> >> >> Regards,
>> >> >> Gilles
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >>
>> >> >
>> >>
>> >
>> >
>>
>
>