422225 – Data copied from LibreOffice Calc to spreadsheet gives wrong results

Summary: Data copied from LibreOffice Calc to spreadsheet gives wrong results

Status:	RESOLVED FIXED

Alias:	None

Product:	LabPlot2
Classification:	Applications
Component:	frontend
Version:	2.7.0
Platform:	Kubuntu Linux

Importance:	NOR normal
Target Milestone:	---
Assignee:	Alexander Semke

URL:
Keywords:

Depends on:
Blocks:

Reported:	2020-05-29 13:16 UTC by Eduardo
Modified:	2020-05-31 09:35 UTC
CC List:	2 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:

Attachments
just a print screen (145.95 KB, image/png) 2020-05-29 13:16 UTC, Eduardo
Copy and paste 0 from libreOffice calc (208.13 KB, image/png) 2020-05-30 13:49 UTC, Eduardo

Description Eduardo 2020-05-29 13:16:25 UTC

Created attachment 128906
just a print screen

SUMMARY

I am trying to do a simple linear fit, and the results differ depending on whether I enter the data manually or copy and paste (Ctrl+C, Ctrl+V) the same data from LibreOffice Calc.

Let's do an example:

Consider three points:

x: (0; 0,4; 0,5)
y: (0; 20; 25)


STEPS TO REPRODUCE
1. Enter the data manually into the spreadsheet
2. Spreadsheet -> plot data -> xy-curve
3. Analysis -> Fit, make a linear fit with y = c0 + c1 x
4. Results should be: c0 = 0,24 and c1 = 50,89
5. Close and open a new project
6. Now enter the data (x,y) into LibreOffice Calc, copy it (from LibreOffice Calc), paste it into the spreadsheet (LabPlot) and make a linear fit just as before.
7. A first bug occurs when I paste the x data and LabPlot recognizes it as integer. I had to set the data type to numeric and the format to automatic (g) to paste the x data successfully. This is not the main problem I wish to report.
8. The results of the linear fit are: c0 = 0,002 and c1 = 51,42, which is different.


OBSERVED RESULT


I observed that on pasting the data, LabPlot fills the lines below (line 4, line 5, ...) with 0, and the fit takes these as data as well.

Here in Brazil we use "," as the decimal separator, and LabPlot seems to read it correctly. That is not a problem for me.
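
The effect of the extra zero rows on the fit can be illustrated with a small least-squares sketch. This is plain Python, not LabPlot code; the helper `linear_fit` is a hypothetical name, x = 0,37 is taken from the screenshot (see comment 1), and the sheet size of 100 rows is an assumption based on the default mentioned later in this thread.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = c0 + c1*x; returns (c0, c1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    c0 = mean_y - c1 * mean_x
    return c0, c1


# The three points from the report (decimal commas written as points).
x = [0.0, 0.37, 0.5]
y = [0.0, 20.0, 25.0]

# Same data followed by 97 rows of (0, 0), assuming a 100-row sheet.
x_padded = x + [0.0] * 97
y_padded = y + [0.0] * 97

c0, c1 = linear_fit(x, y)
c0_padded, c1_padded = linear_fit(x_padded, y_padded)
print(f"3 rows:   c0 = {c0:.4f}, c1 = {c1:.4f}")
print(f"100 rows: c0 = {c0_padded:.4f}, c1 = {c1_padded:.4f}")
```

With three rows the sketch gives values close to those in step 4, and with the padded rows values close to those in step 8, which is consistent with the zeros being treated as data points.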

LibreOffice Version: 6.4.3.2

EXPECTED RESULT


SOFTWARE/OS VERSIONS
Windows: 
macOS: 
Linux/KDE Plasma: 20.04 lts
(available in About System)
KDE Plasma Version: 
KDE Frameworks Version: 
Qt Version: 

ADDITIONAL INFORMATION

Comment 1 Eduardo 2020-05-29 13:24:24 UTC

The value of c0 was so small that I changed the input data to

x: (0; 0,37; 0,5)
y: (0; 20; 25)

and forgot to correct it in the first message. Consider x = 0,37, as in the attached screenshot.

Comment 2 disuser 2020-05-29 19:32:51 UTC

I have just checked it with the data you provided:

x: (0; 0,37; 0,5)
y: (0; 20; 25)

There are two variables (columns) of Numeric type (x, y) with three cases (rows) each. And I got virtually the same results in both LabPlot and in Libreoffice:

LabPlot:
c₀ = 0.00241456±0.011553 (478 %)
 (t statistic: 0.209, p value: 0.869, conf. interval: -0.14438 .. 0.149209)
c₁ = 0.508915±0.0321703 (6.32 %)
 (t statistic: 15.8, p value: 0.0402, conf. interval: 0.100152 .. 0.917678)

Libreoffice:
c₀ = 0.00241456 = intercept(y,x)
c₁ = 0.50891530 = slope(y,x)

However, if I don't reduce the default number of cases (rows) from one hundred to three, LabPlot gives me the following estimates:

Parameters:
c₀ = 2.57031e-05±0.000122982 (478 %)
 (t statistic: 0.209, p value: 0.835, conf. interval: -0.000218351 .. 0.000269757)
c₁ = 0.514287±0.00197717 (0.384 %)
 (t statistic: 260, p value: 0, conf. interval: 0.510363 .. 0.518211)

Comment 3 Alexander Semke 2020-05-29 19:40:22 UTC

(In reply to Eduardo from comment #1)
> the value of c0 were so small and then i changed the input data to
> 
> x: (0; 0,37; 0,5)
> y: (0; 20; 25)
> 
> and forget to correct on the first message. Consider x = 0,37 as in the
> print screen in attachment

We use the first row of the data to be pasted to recognize the data type. In this case the first row contains 0, so we set the column types to integer. After this, the conversion of 0,37 and of 0,5 to integer fails and you get 0 for these cells. For all the remaining rows you also get 0. This is because of another bug where we don't properly differentiate between "empty integer values" and "zero integer values"; this will be solved in v2.9.

To solve the actual problem, just use 0,0 for the first row:

0,0	0,0
0,37	20,0
0,5	25,0

With this you'll get numeric columns with three rows only and the fit will produce the correct results.
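
The type detection and the workaround described in this comment can be modelled roughly as follows. This is a simplified Python model, not LabPlot's actual implementation; all function names (`parse_decimal`, `detect_type`, `convert`, `paste_column`) are hypothetical, and the fallback to 0 on a failed integer conversion is taken from the explanation above.

```python
def parse_decimal(text):
    """Parse a number written with a decimal comma, e.g. '0,37' -> 0.37."""
    return float(text.replace(",", "."))


def detect_type(first_cell):
    """Pick the column type from the first pasted cell only (hypothetical model)."""
    try:
        int(first_cell)
        return "integer"
    except ValueError:
        return "numeric"


def convert(cell, column_type):
    """Convert one cell; a failed integer conversion falls back to 0."""
    if column_type == "integer":
        try:
            return int(cell)
        except ValueError:
            return 0
    return parse_decimal(cell)


def paste_column(cells):
    """Return the detected column type and the converted values."""
    column_type = detect_type(cells[0])
    return column_type, [convert(c, column_type) for c in cells]


as_reported = paste_column(["0", "0,37", "0,5"])      # first cell looks like an integer
with_workaround = paste_column(["0,0", "0,37", "0,5"])  # first cell forces a numeric column
print(as_reported)
print(with_workaround)
```

In this model the first call yields an integer column in which 0,37 and 0,5 are lost, while writing the first cell as 0,0 keeps all three values.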

Comment 4 disuser 2020-05-29 19:46:00 UTC

Perhaps the default number of rows would be better set to 1, with the ability to auto-expand upon entering or pasting data?

Comment 5 Stefan Gerlach 2020-05-29 19:49:46 UTC

2.8.0 gives the correct results:

c₀ = 0.241456±1.1553 (478 %)
 (t statistic: 0.209, p value: 0.869, conf. interval: -14.438 .. 14.9209)
c₁ = 50.8915±3.21703 (6.32 %)
 (t statistic: 15.8, p value: 0.0402, conf. interval: 10.0152 .. 91.7678)

with 3 and 100 columns. The conf. interval though seems suspicious. I will check this.

@disuser@disroot.org: Can you check your input data?
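
As a rough cross-check of the confidence intervals, the sketch below recomputes them by hand. It assumes that the reported interval is a two-sided 95 % interval based on Student's t with n - 2 = 1 degree of freedom; the thread does not state how LabPlot computes it, so this is an assumption, not a description of LabPlot's code.

```python
import math

x = [0.0, 0.37, 0.5]
y = [0.0, 20.0, 25.0]
n = len(x)

# Least-squares estimates for y = c0 + c1*x.
mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
c1 = sxy / sxx
c0 = mean_y - c1 * mean_x

# Residual standard deviation with n - 2 degrees of freedom.
ssr = sum((yi - (c0 + c1 * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(ssr / (n - 2))

# Standard errors of slope and intercept.
se_c1 = s / math.sqrt(sxx)
se_c0 = s * math.sqrt(sum(xi ** 2 for xi in x) / (n * sxx))

# Student's t with 1 degree of freedom is the Cauchy distribution,
# so its 97.5 % quantile has the closed form tan(pi * (0.975 - 0.5)).
t_crit = math.tan(math.pi * (0.975 - 0.5))

ci_c0 = (c0 - t_crit * se_c0, c0 + t_crit * se_c0)
ci_c1 = (c1 - t_crit * se_c1, c1 + t_crit * se_c1)
print(f"c0 = {c0:.6f} +/- {se_c0:.4f}, interval {ci_c0[0]:.3f} .. {ci_c0[1]:.4f}")
print(f"c1 = {c1:.4f} +/- {se_c1:.5f}, interval {ci_c1[0]:.4f} .. {ci_c1[1]:.4f}")
```

Under this assumption the wide intervals would follow from having only one degree of freedom, since the critical t value is then about 12.7.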

Comment 6 disuser 2020-05-29 20:06:42 UTC

Actually, I meant 100 rows (the default number), not columns. The input data was as described above. I have to remove the redundant rows, otherwise I'll get a wrong result. That is why I have suggested an auto-expansion of spreadsheets upon entering or pasting data.

Comment 7 Eduardo 2020-05-30 13:49:23 UTC

Created attachment 128931
Copy and paste 0 from libreOffice calc

It fills all rows with 0.

Comment 8 Eduardo 2020-05-30 13:59:13 UTC

Hi everyone, thanks for the comments.

(In reply to disuser from comment #2)
> I have just checked it with the data you provided:
> 
> x: (0; 0,37; 0,5)
> y: (0; 20; 25)
> 
> There are two variables (columns) of Numeric type (x, y) with three cases
> (rows) each. And I got virtually the same results in both LabPlot and in
> Libreoffice:
> 
> LabPlot:
> c₀ = 0.00241456±0.011553 (478 %)
>  (t statistic: 0.209, p value: 0.869, conf. interval: -0.14438 .. 0.149209)
> c₁ = 0.508915±0.0321703 (6.32 %)
>  (t statistic: 15.8, p value: 0.0402, conf. interval: 0.100152 .. 0.917678)
> 
> Libreoffice:
> c₀ = 0.00241456 = intercept(y,x)
> c₁ = 0.50891530 = slope(y,x)
> 
> However, if I don't reduce the default number of cases (rows) from one
> hundred to three, LabPlot gives me the following estimates:
> 
> Parameters:
> c₀ = 2.57031e-05±0.000122982 (478 %)
>  (t statistic: 0.209, p value: 0.835, conf. interval: -0.000218351 ..
> 0.000269757)
> c₁ = 0.514287±0.00197717 (0.384 %)
>  (t statistic: 260, p value: 0, conf. interval: 0.510363 .. 0.518211)

disuser, you probably used the values

x: (0; 0,37; 0,5)
y: (0; 0,20; 0,25)
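
A quick check of this guess, as a plain Python sketch (the helper `linear_fit` is a hypothetical name, and the 100-row sheet with zero-filled rows is an assumption based on comment 2):

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = c0 + c1*x; returns (c0, c1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    return mean_y - c1 * mean_x, c1


x = [0.0, 0.37, 0.5]
y_small = [0.0, 0.20, 0.25]  # y values divided by 100

# Three rows only.
c0_3, c1_3 = linear_fit(x, y_small)

# Same data in a 100-row sheet where the other 97 rows hold zeros.
c0_100, c1_100 = linear_fit(x + [0.0] * 97, y_small + [0.0] * 97)

print(f"3 rows:   c0 = {c0_3:.8f}, c1 = {c1_3:.6f}")
print(f"100 rows: c0 = {c0_100:.6e}, c1 = {c1_100:.6f}")
```

With these y values the sketch gives estimates close to both sets of numbers quoted from comment 2, a factor of 100 smaller than the ones for y = (0; 20; 25).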

But the problems that I can see, as reported by Alexander Semke, are:

1. The first row sets the data type to integer, which changes x = 0,37 to x = 0.

2. All remaining rows are filled with 0 (see the screenshot in the attachment), regardless of the data type of the values that follow.

Should I change the status to resolved?

Comment 9 Alexander Semke 2020-05-31 09:35:46 UTC

(In reply to Eduardo from comment #8)
> But the problem that i can see, as reported by Alexander Semke, is 
> 
> 1. first row recognize data type as integer and change x = 0,37 by x = 0
> 
> 2. fill all rows with 0 (look a print screen in attachment) and does not
> matter what kind of data type is in the following.
> 
> Should i change the status to resolved?
Yes, let's set it to resolved. The second problem with zero values instead of empty values is already tracked on our side and will be fixed in 2.9.
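
One way to picture the difference between empty and zero values is the sketch below. It is only an illustration in Python, not the planned LabPlot 2.9 change; using `None` as the marker for an empty cell and the helper names `linear_fit` and `valid_rows` are assumptions made for this example.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = c0 + c1*x; returns (c0, c1)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    c1 = sxy / sxx
    return mean_y - c1 * mean_x, c1


def valid_rows(xs, ys):
    """Keep only rows where both cells hold a value (None marks an empty cell)."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# A 100-row sheet: three filled rows, the rest empty rather than zero.
x_col = [0.0, 0.37, 0.5] + [None] * 97
y_col = [0.0, 20.0, 25.0] + [None] * 97

xs, ys = valid_rows(x_col, y_col)
c0, c1 = linear_fit(xs, ys)
print(f"{len(xs)} rows used: c0 = {c0:.4f}, c1 = {c1:.4f}")
```

The genuine zero in the first row is kept as a data point, while the empty rows are skipped, so the fit uses only the three entered points.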