Discussion:
[Scilab-users] HDF5 save is super slow
Arvid Rosén
2018-10-15 08:11:15 UTC
Permalink
Dear Scilab list,

We have been using Scilab since version 3 at my company. Migrating from one version to another has always been some work, but going from 5 to 6 seems to be the most difficult so far.

One of the problems for us is the new HDF5 format for loading and saving. We have a huge number of old data sets that we need to keep working with, and most of them contain large numbers of state-space systems stored in lists. However, loading and saving these data sets is extremely slow compared to the old binary format. So slow, in fact, that using Scilab 6 is impossible for us at the moment. I have a short test case that demonstrates the problem:

/////////////////////////////////
N = 4;
n = 10000;

filters = list();

for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end

tic();
save('filters.dat', filters);
ts1 = toc();

tic();
save('filters.dat', 'filters');
ts2 = toc();

printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////

Obviously, the code above needs to run in Scilab 5, as it uses both the new and the old methods for saving the list of state-space filters. And to be fair, HDF5 saving is a bit faster in Scilab 6, but still orders of magnitude slower than the old format in Scilab 5. As a reference, below is the output on my pretty fast Mac running Scilab 5:

Warning: Scilab 6 will not support the file format used.
Warning: Please quote the variable declaration. Example, save('myData.sod',a) becomes save('myData.sod','a').
Warning: See help('save') for the rational.
Warning: file 'filters.dat' already opened in Scilab.
old save 0.03s
new save 20.93s
slowdown 775.0

So my questions:
Can and will this be addressed in future versions of Scilab?
Can I store a large number of state-space systems in another way to make this faster?

Best Regards,
Arvid
a***@laas.fr
2018-10-15 09:07:49 UTC
Permalink
Hello,

I tried your code in 5.5.1 and the latest nightly build of 6.0: I see a
slowdown of around 175 between the old save in 5.5.1 and the new (and
only) save in 6.0.
It's really related to the data structure, because we use hdf5
read/write a lot here and did not experience significant slowdowns using
6.0.
I think the overhead might come from the translation of your fairly
complex variable (a long array of tlists) into the corresponding hdf5
structure.
In the old save, this translation was not necessary.
Maybe you could try to save your data in a different way.
For example:
1) you could save each element of "filters" in a separate file.
2) you could bypass save and directly write your data to an hdf5 file
by using h5open() and h5write() directly. It means you need to write
your own load() for your custom file format. But this way, you can try
to find the best way to lay out your data in hdf5 format.
3) in addition to 2), you could try to save each entry of your "filters"
list as one dataset in a given hdf5 file.
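For suggestion 3), the idea is to flatten each filter into one numeric block and store it as a single dataset, so hdf5 only sees n flat datasets instead of n nested groups. A minimal sketch of that layout, written in Python with h5py purely for illustration (the file and dataset names are made up; the same structure could be produced from Scilab with h5open()/h5write()):

```python
# Suggestion 3 sketched at the HDF5 level: one flat dataset per filter,
# each holding the packed [A, B; C, D] block, instead of one nested
# group per tlist. File and dataset names here are illustrative only.
import numpy as np
import h5py

N, n = 4, 100  # small n keeps the demo quick

with h5py.File("filters_demo.h5", "w") as f:
    for i in range(n):
        A, B = np.random.rand(N, N), np.random.rand(N, 1)
        C, D = np.random.rand(1, N), np.random.rand(1, 1)
        block = np.block([[A, B], [C, D]])        # (N+1) x (N+1)
        f.create_dataset(f"filter_{i:04d}", data=block)

# Reading one filter back and unpacking the four matrices:
with h5py.File("filters_demo.h5", "r") as f:
    block = f["filter_0000"][()]
A, B = block[:N, :N], block[:N, N:]
C, D = block[N:, :N], block[N:, N:]
```

The packing relies on every filter having the same dimensions; ragged collections would need one dataset per matrix or a dimensions attribute per entry.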

Did you search on bugzilla whether this bug was already submitted?
Could you try to report it?


Antoine
Post by Arvid Rosén
/////////////////////////////////
N = 4;
n = 10000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end
tic();
save('filters.dat', filters);
ts1 = toc();
tic();
save('filters.dat', 'filters');
ts2 = toc();
printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE

Tel:+33 5 61 33 64 59

email : ***@laas.fr
permanent email : ***@polytechnique.org

+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Arvid Rosén
2018-10-15 09:55:20 UTC
Permalink
Hi,

Thanks for getting back to me!

Unfortunately, we used Scilab’s pretty cool way of doing object orientation, so we have big nested tlist structures with multiple instances of various lists of filters and other structures, as in my example. Saving those structures in some explicit manual way would be extremely complicated. Or is there some way of writing explicit HDF5 saving/loading schemes using overloading? That would be great! I am sure we could find the main culprits and do something explicit for them, but as they can be located wherever in a big nested structure, it would be painful to do anything on the top level.

Another, related I guess, problem here is that the new file format uses about 15 times as much disk space as the old format (for a typical ill-behaved nested structure). That adds to the save/load time too I guess, but is probably not the main source here.

I think I might have reported this earlier using Bugzilla, but I’m not sure. I’ll check and report it if not.

Cheers,
Arvid

From: users <users-***@lists.scilab.org> on behalf of "***@laas.fr" <***@laas.fr>
Reply-To: "***@laas.fr" <***@laas.fr>, Users mailing list for Scilab <***@lists.scilab.org>
Date: Monday, 15 October 2018 at 11:08
To: "***@lists.scilab.org" <***@lists.scilab.org>
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello,

I tried your code in 5.5.1 and the last nightly-build of 6.0: I see a slowdown of around 175 between old save in 5.5.1 and new (and only) save in 6.0.
It's really related to the data structure, because we use hdf5 read/write a lot here and did not experience significant slowdowns using 6.0.
I think the overhead might come from the translation of your fairly complex variable (a long array of tlists) into the corresponding hdf5 structure.
In the old save, this translation was not necessary.
Maybe you could try to save your data in a different way.
For example:
1) you could save each element of "filters" in a separate file.
2) you could bypass save and directly write your data to an hdf5 file by using h5open() and h5write() directly. It means you need to write your own load() for your custom file format. But this way, you can try to find the best way to lay out your data in hdf5 format.
3) in addition to 2), you could try to save each entry of your "filters" list as one dataset in a given hdf5 file.

Did you search on bugzilla whether this bug was already submitted?
Could you try to report it?


Antoine

Le 15/10/2018 à 10:11, Arvid Rosén a écrit :
/////////////////////////////////
N = 4;
n = 10000;

filters = list();

for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end

tic();
save('filters.dat', filters);
ts1 = toc();

tic();
save('filters.dat', 'filters');
ts2 = toc();

printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++



Antoine Monmayrant LAAS - CNRS

7 avenue du Colonel Roche

BP 54200

31031 TOULOUSE Cedex 4

FRANCE



Tel:+33 5 61 33 64 59



email : ***@laas.fr<mailto:***@laas.fr>

permanent email : ***@polytechnique.org<mailto:***@polytechnique.org>



+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Antoine Monmayrant
2018-10-15 10:22:58 UTC
Permalink
Post by Arvid Rosén
Hi,
Thanks for getting back to me!
Unfortunately, we used Scilab’s pretty cool way of doing object
orientation, so we have big nested tlist structures with multiple
instances of various lists of filters and other structures, as in my
example. Saving those structures in some explicit manual way would be
extremely complicated. Or is there some way of writing explicit HDF5
saving/loading schemes using overloading? That would be great! I am
sure we could find the main culprits and do something explicit for
them, but as they can be located wherever in a big nested structure,
it would be painful to do anything on the top level.
Another, related I guess, problem here is that the new file format
uses about 15 times as much disk space as the old format (for a
typical ill-behaved nested structure). That adds to the save/load time
too I guess, but is probably not the main source here.
Argh, yes, I tested it, and with your example I get a file that is about 8.5x bigger.
I think that both increases in time and size are real issues and should
be reported as bugs.

By the way, I rewrote your script to run it under both 6.0 and 5.5:

/////////////////////////////////
N = 4;
n = 10000;
filters = list();

for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end

ver=getversion('scilab');

if ver(1)<6 then
    tic();
    save('filters_old.dat', filters);
    ts1 = toc();
else
    tic();
    save('filters_new.dat', 'filters');
    ts1 = toc();
end

printf("Time for save %.2fs\n", ts1);
/////////////////////////////////

Hope it helps,

Antoine
Post by Arvid Rosén
I think I might have reported this earlier using Bugzilla, but I’m not
sure. I’ll check and report it if not.
Cheers,
Arvid
*Date: *Monday, 15 October 2018 at 11:08
*Subject: *Re: [Scilab-users] HDF5 save is super slow
Hello,
I tried your code in 5.5.1 and the last nightly-build of 6.0: I see a
slowdown of around 175 between old save in 5.5.1 and new (and only)
save in 6.0.
It's really related to the data structure, because we use hdf5
read/write a lot here and did not experience significant slowdowns
using 6.0.
I think the overhead might come from the translation of your fairly
complex variable (a long array of tlists) into the corresponding hdf5
structure.
In the old save, this translation was not necessary.
Maybe you could try to save your data in a different way.
1) you could save each element of "filters" in a separate file.
2) you could bypass save and directly write your data to an hdf5 file
by using h5open() and h5write() directly. It means you need to write
your own load() for your custom file format. But this way, you can try
to find the best way to lay out your data in hdf5 format.
3) in addition to 2), you could try to save each entry of your
"filters" list as one dataset in a given hdf5 file.
Did you search on bugzilla whether this bug was already submitted?
Could you try to report it?
Antoine
/////////////////////////////////
N = 4;
n = 10000;
filters = list();
for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end
tic();
save('filters.dat', filters);
ts1 = toc();
tic();
save('filters.dat', 'filters');
ts2 = toc();
printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE
Tel:+33 5 61 33 64 59
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE

Tel:+33 5 61 33 64 59

email : ***@laas.fr
permanent email : ***@polytechnique.org

+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Stéphane Mottelet
2018-10-15 12:36:12 UTC
Permalink
Hello,

I looked a little bit at the sources: the obvious bottleneck is the
nested creation of an hdf5 group each time a container variable is met.
The given example makes this particularly clear: if you replace each
syslin structure by the corresponding [A,B;C,D] matrix, save is ten
times faster:

N = 4;
n = 1000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());

   0.724754

N = 4;
n = 1000;
filters = list()
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = [G.a G.b;G.c G.d];
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());

   0.082302

Serializing container objects seems to be the solution, but it goes in
a direction orthogonal to the hdf5 portability spirit.
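To make the group-creation overhead concrete, the difference between one group per container and one packed dataset can be timed directly at the HDF5 level. A rough sketch in Python with h5py for illustration (file names are arbitrary and absolute timings will vary by machine; only the layout difference matters):

```python
# Compare writing n small per-filter groups against one packed 3-D
# dataset holding all [A,B;C,D] blocks. Illustrative sketch only.
import time
import numpy as np
import h5py

N, n = 4, 1000
blocks = np.random.rand(n, N + 1, N + 1)  # all packed filters at once

t0 = time.perf_counter()
with h5py.File("per_filter.h5", "w") as f:
    for i in range(n):
        g = f.create_group(f"filter_{i}")         # one group per filter...
        g.create_dataset("abcd", data=blocks[i])  # ...holding one small dataset
t_groups = time.perf_counter() - t0

t0 = time.perf_counter()
with h5py.File("packed.h5", "w") as f:
    f.create_dataset("filters", data=blocks)      # one contiguous dataset
t_packed = time.perf_counter() - t0

print(f"per-group {t_groups:.3f}s, packed {t_packed:.3f}s")
```

The per-group layout pays HDF5 metadata costs n times over, which mirrors the nested-group translation that save performs for each tlist.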

S.
Post by Antoine Monmayrant
Post by Arvid Rosén
Hi,
Thanks for getting back to me!
Unfortunately, we used Scilab’s pretty cool way of doing object
orientation, so we have big nested tlist structures with multiple
instances of various lists of filters and other structures, as in my
example. Saving those structures in some explicit manual way would be
extremely complicated. Or is there some way of writing explicit HDF5
saving/loading schemes using overloading? That would be great! I am
sure we could find the main culprits and do something explicit for
them, but as they can be located wherever in a big nested structure,
it would be painful to do anything on the top level.
Another, related I guess, problem here is that the new file format
uses about 15 times as much disk space as the old format (for a
typical ill-behaved nested structure). That adds to the save/load
time too I guess, but is probably not the main source here.
Argh, yes, I tested it and in your example, I have a file x8.5 bigger.
I think that both increases in time and size are real issues and
should be reported as bugs.
/////////////////////////////////
N = 4;
n = 10000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end
ver=getversion('scilab');
if ver(1)<6 then
    tic();
    save('filters_old.dat', filters);
    ts1 = toc();
else
    tic();
    save('filters_new.dat', 'filters');
    ts1 = toc();
end
printf("Time for save %.2fs\n", ts1);
/////////////////////////////////
Hope it helps,
Antoine
Post by Arvid Rosén
I think I might have reported this earlier using Bugzilla, but I’m
not sure. I’ll check and report it if not.
Cheers,
Arvid
*Date: *Monday, 15 October 2018 at 11:08
*Subject: *Re: [Scilab-users] HDF5 save is super slow
Hello,
I tried your code in 5.5.1 and the last nightly-build of 6.0: I see a
slowdown of around 175 between old save in 5.5.1 and new (and only)
save in 6.0.
It's really related to the data structure, because we use hdf5
read/write a lot here and did not experience significant slowdowns
using 6.0.
I think the overhead might come from the translation of your fairly
complex variable (a long array of tlists) into the corresponding hdf5
structure.
In the old save, this translation was not necessary.
Maybe you could try to save your data in a different way.
1) you could save each element of "filters" in a separate file.
2) you could bypass save and directly write your data to an hdf5 file
by using h5open() and h5write() directly. It means you need to write
your own load() for your custom file format. But this way, you can
try to find the best way to lay out your data in hdf5 format.
3) in addition to 2), you could try to save each entry of your
"filters" list as one dataset in a given hdf5 file.
Did you search on bugzilla whether this bug was already submitted?
Could you try to report it?
Antoine
/////////////////////////////////
N = 4;
n = 10000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end
tic();
save('filters.dat', filters);
ts1 = toc();
tic();
save('filters.dat', 'filters');
ts2 = toc();
printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE
Tel:+33 5 61 33 64 59
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE
Tel:+33 5 61 33 64 59
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
_______________________________________________
users mailing list
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists.scilab.org/mailman/listinfo/users
--
Stéphane Mottelet
Ingénieur de recherche
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet
Arvid Rosén
2018-10-15 13:07:08 UTC
Permalink
Stéphane Mottelet
2018-10-15 13:35:04 UTC
Permalink
Post by Arvid Rosén
Hi,
Yeah, that makes sense. Or, it was about what I expected at least. It
is a pity though, as handling thousands of filters isn’t necessarily a
strange thing to do with a software like Scilab, and making a special
serialization like that would be nothing less than a hack.
Do you think there is a way forward under the hood that could make big
deep list structures >10x faster in the future?
No. I think that hdf5 is not convenient for deeply structured data with
small leaves. Some interesting discussions can be found here:

https://cyrille.rossant.net/should-you-use-hdf5/
https://cyrille.rossant.net/moving-away-hdf5/

If you just need to read/write within your own software, serializing
should not be an issue. In the example you gave, the structure of each
leaf is always the same: using an array of structs improves performance
a little bit:

clear
N = 4;
n = 1000;
for i=1:n
   G(i).a=rand(N,N);
   G(i).b=rand(N,1);
   G(i).c=rand(1,N);
   G(i).d=rand(1,1);
end
tic();
save('filters.dat', 'G');
disp(toc());
--> disp(toc());

   0.24133
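Pushed one step further, the same structure-of-arrays idea can collapse the whole collection into four stacked arrays, i.e. four datasets total regardless of n. A sketch in Python with h5py for illustration (file and dataset names are assumed, not an existing format):

```python
# Structure-of-arrays layout: because every filter shares the same
# (a, b, c, d) shapes, the whole collection fits in four stacked
# arrays, written as four datasets total. Names are illustrative.
import numpy as np
import h5py

N, n = 4, 1000
a = np.random.rand(n, N, N)
b = np.random.rand(n, N, 1)
c = np.random.rand(n, 1, N)
d = np.random.rand(n, 1, 1)

with h5py.File("filters_soa.h5", "w") as f:
    for name, arr in (("a", a), ("b", b), ("c", c), ("d", d)):
        f.create_dataset(name, data=arr)

with h5py.File("filters_soa.h5", "r") as f:
    a_back = f["a"][()]

assert np.array_equal(a, a_back)  # lossless round trip
```

Four large writes avoid almost all per-element metadata overhead, at the cost of requiring uniform filter dimensions.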


S.
Post by Arvid Rosén
Otherwise, the whole object orientation part of Scilab (tlist and
mlist etc.) would be hard to use for anything that comes in large
numbers, which would be a shame, especially as it used to work just
fine (well, I can see how the old structure wasn’t “just fine” in
other ways, but still).
Cheers,
Arvid
*Organization: *Université de Technologie de Compiègne
*Date: *Monday, 15 October 2018 at 14:37
*Subject: *Re: [Scilab-users] HDF5 save is super slow
Hello,
I looked a little bit in the sources: the evident bottleneck is the
nested creation of an hdf5 group each time that a container variable is met.
For the given example, this is particularly evident. If you replace
the syslin structure by the corresponding [A,B;C,D] matrix, then save
N = 4;
n = 1000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());
   0.724754
N = 4;
n = 1000;
filters = list()
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = [G.a G.b;G.c G.d];
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());
   0.082302
Serializing container objects seems to be the solution, but it goes
towards an orthogonal direction w.r.t. the hdf5 portability spirit.
S.
Hi,
Thanks for getting back to me!
Unfortunately, we used Scilab’s pretty cool way of doing
object orientation, so we have big nested tlist structures
with multiple instances of various lists of filters and other
structures, as in my example. Saving those structures in some
explicit manual way would be extremely complicated. Or is
there some way of writing explicit HDF5 saving/loading schemes
using overloading? That would be great! I am sure we could
find the main culprits and do something explicit for them, but
as they can be located wherever in a big nested structure, it
would be painful to do anything on the top level.
Another, related I guess, problem here is that the new file
format uses about 15 times as much disk space as the old
format (for a typical ill-behaved nested structure). That adds
to the save/load time too I guess, but is probably not the
main source here.
Argh, yes, I tested it and in your example, I have a file x8.5 bigger.
I think that both increases in time and size are real issues and
should be reported as bugs.
/////////////////////////////////
N = 4;
n = 10000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end
ver=getversion('scilab');
if ver(1)<6 then
    tic();
    save('filters_old.dat', filters);
    ts1 = toc();
else
    tic();
    save('filters_new.dat', 'filters');
    ts1 = toc();
end
printf("Time for save %.2fs\n", ts1);
/////////////////////////////////
Hope it helps,
Antoine
I think I might have reported this earlier using Bugzilla, but
I’m not sure. I’ll check and report it if not.
Cheers,
Arvid
*Date: *Monday, 15 October 2018 at 11:08
*Subject: *Re: [Scilab-users] HDF5 save is super slow
Hello,
I see a slowdown of around 175 between old save in 5.5.1 and
new (and only) save in 6.0.
It's really related to the data structure, because we use hdf5
read/write a lot here and did not experience significant
slowdowns using 6.0.
I think the overhead might come from the translation of your
fairly complex variable (a long array of tlists) into the
corresponding hdf5 structure.
In the old save, this translation was not necessary.
Maybe you could try to save your data in a different way.
1) you could save each element of "filters" in a separate file.
2) you could bypass save and directly write your data to an
hdf5 file by using h5open() and h5write() directly. It means you
need to write your own load() for your custom file format. But
this way, you can try to find the best way to lay out your data
in hdf5 format.
3) in addition to 2), you could try to save each entry of your
"filters" list as one dataset in a given hdf5 file.
Did you search on bugzilla whether this bug was already submitted?
Could you try to report it?
Antoine
/////////////////////////////////
N = 4;
n = 10000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end
tic();
save('filters.dat', filters);
ts1 = toc();
tic();
save('filters.dat', 'filters');
ts2 = toc();
printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE
Tel:+33 5 61 33 64 59
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE
Tel:+33 5 61 33 64 59
+++++++++++++++++++++++++++++++++++++++++++++++++++++++
_______________________________________________
users mailing list
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists.scilab.org/mailman/listinfo/users
<https://antispam.utc.fr/proxy/2/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists.scilab.org/mailman/listinfo/users>
--
Stéphane Mottelet
Ingénieur de recherche
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet
<https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/www.utc.fr/%7Emottelet>
_______________________________________________
users mailing list
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists.scilab.org/mailman/listinfo/users
--
Stéphane Mottelet
Ingénieur de recherche
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet
Clément DAVID
2018-10-15 13:47:22 UTC
Permalink
Hello all,

Correct, I experienced this kind of slowness while working with Xcos diagrams for Scilab 5. At first we considered HDF5 for that deeply nested list / mlist data-structure storage; however, after some tests, it appeared that XML might be used for tree-like storage and HDF5 (or Java type serialization) for big matrices.

AFAIK there is currently no easy way to load/save with a format other than HDF5; maybe adding xmlSave/xmlLoad sci_gateways to let the user select an XML file format for any Scilab structure could provide better performance for your use-case. JSON might also be another candidate to look at for decent serialization support.

PS: Scilab 5.5.1 load/save is a direct memory dump, so it is really the fastest you can get from Scilab; the HDF5 binary format is good enough for matrices.
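The JSON route could look roughly like the sketch below, in Python for illustration only (the field names and the helper function are invented for this example, not an existing Scilab gateway). JSON stays portable and human-readable, but is verbose for large numeric payloads:

```python
# Hypothetical JSON serialization of a list of state-space systems.
# Field names ("dom", "a", ...) mirror syslin's members but are an
# assumption, not an existing file format.
import json
import numpy as np

def syslin_to_dict(A, B, C, D):
    """Flatten one continuous-time state-space system into JSON-able lists."""
    return {"dom": "c", "a": A.tolist(), "b": B.tolist(),
            "c": C.tolist(), "d": D.tolist()}

N, n = 4, 10
filters = [syslin_to_dict(np.random.rand(N, N), np.random.rand(N, 1),
                          np.random.rand(1, N), np.random.rand(1, 1))
           for _ in range(n)]

text = json.dumps({"filters": filters})   # serialize the whole list
back = json.loads(text)                   # and read it back

assert np.allclose(back["filters"][0]["a"], filters[0]["a"])
```

A single dumps/loads pair avoids per-node file-format overhead entirely, though numbers are stored as decimal text, so files grow and exact binary round-tripping is not guaranteed.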
--
Clément

From: users <users-***@lists.scilab.org> On Behalf Of Stéphane Mottelet
Sent: Monday, October 15, 2018 2:36 PM
To: ***@lists.scilab.org
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello,

I looked a little bit in the sources: the evident bottleneck is the nested creation of an hdf5 group each time that a container variable is met.
For the given example, this is particularly evident. If you replace the syslin structure by the corresponding [A,B;C,D] matrix, then save is ten times faster:

N = 4;
n = 1000;
filters = list();
for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());

0.724754

N = 4;
n = 1000;
filters = list()
for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = [G.a G.b;G.c G.d];
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());

0.082302

Serializing container objects seems to be the solution, but it goes towards an orthogonal direction w.r.t. the hdf5 portability spirit.

S.


Le 15/10/2018 à 12:22, Antoine Monmayrant a écrit :
Le 15/10/2018 à 11:55, Arvid Rosén a écrit :
Hi,

Thanks for getting back to me!

Unfortunately, we used Scilab’s pretty cool way of doing object orientation, so we have big nested tlist structures with multiple instances of various lists of filters and other structures, as in my example. Saving those structures in some explicit manual way would be extremely complicated. Or is there some way of writing explicit HDF5 saving/loading schemes using overloading? That would be great! I am sure we could find the main culprits and do something explicit for them, but as they can be located wherever in a big nested structure, it would be painful to do anything on the top level.

Another, related I guess, problem here is that the new file format uses about 15 times as much disk space as the old format (for a typical ill-behaved nested structure). That adds to the save/load time too I guess, but is probably not the main source here.
Argh, yes, I tested it and in your example, I have a file x8.5 bigger.
I think that both increases in time and size are real issues and should be reported as bugs.

By the way, I rewrote your script to run it under both 6.0 and 5.5:

/////////////////////////////////
N = 4;
n = 10000;
filters = list();

for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end

ver=getversion('scilab');

if ver(1)<6 then
tic();
save('filters_old.dat', filters);
ts1 = toc();
else
tic();
save('filters_new.dat', 'filters');
ts1 = toc();
end

printf("Time for save %.2fs\n", ts1);
/////////////////////////////////

Hope it helps,

Antoine



I think I might have reported this earlier using Bugzilla, but I’m not sure. I’ll check and report it if not.

Cheers,
Arvid

From: users <users-***@lists.scilab.org><mailto:users-***@lists.scilab.org> on behalf of "***@laas.fr"<mailto:***@laas.fr> <***@laas.fr><mailto:***@laas.fr>
Reply-To: "***@laas.fr"<mailto:***@laas.fr> <***@laas.fr><mailto:***@laas.fr>, Users mailing list for Scilab <***@lists.scilab.org><mailto:***@lists.scilab.org>
Date: Monday, 15 October 2018 at 11:08
To: "***@lists.scilab.org"<mailto:***@lists.scilab.org> <***@lists.scilab.org><mailto:***@lists.scilab.org>
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello,

I tried your code in 5.5.1 and the last nightly-build of 6.0: I see a slowdown of around 175 between old save in 5.5.1 and new (and only) save in 6.0.
It's really related to the data structure, because we use hdf5 read/write a lot here and did not experience significant slowdowns using 6.0.
I think the overhead might come from the translation of your fairly complex variable (a long array of tlists) into the corresponding hdf5 structure.
In the old save, this translation was not necessary.
Maybe you could try to save your data in a different way.
For example:
1) you could save each element of "filters" in a separate file.
2) you could bypass save and directly write your data to an hdf5 file by using h5open() and h5write() directly. It means you need to write your own load() for your custom file format. But this way, you can try to find the best way to lay out your data in hdf5 format.
3) in addition to 2), you could try to save each entry of your "filters" list as one dataset in a given hdf5 file.

Did you search on bugzilla whether this bug was already submitted?
Could you try to report it?


Antoine

Le 15/10/2018 à 10:11, Arvid Rosén a écrit :
/////////////////////////////////
N = 4;
n = 10000;

filters = list();

for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end

tic();
save('filters.dat', filters);
ts1 = toc();

tic();
save('filters.dat', 'filters');
ts2 = toc();

printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++



Antoine Monmayrant LAAS - CNRS

7 avenue du Colonel Roche

BP 54200

31031 TOULOUSE Cedex 4

FRANCE



Tel:+33 5 61 33 64 59



email : ***@laas.fr<mailto:***@laas.fr>

permanent email : ***@polytechnique.org<mailto:***@polytechnique.org>



+++++++++++++++++++++++++++++++++++++++++++++++++++++++
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++



Antoine Monmayrant LAAS - CNRS

7 avenue du Colonel Roche

BP 54200

31031 TOULOUSE Cedex 4

FRANCE



Tel:+33 5 61 33 64 59



email : ***@laas.fr<mailto:***@laas.fr>

permanent email : ***@polytechnique.org<mailto:***@polytechnique.org>



+++++++++++++++++++++++++++++++++++++++++++++++++++++++






_______________________________________________

users mailing list

***@lists.scilab.org<mailto:***@lists.scilab.org>

https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists.scilab.org/mailman/listinfo/users
--
Stéphane Mottelet

Ingénieur de recherche

EA 4297 Transformations Intégrées de la Matière Renouvelable

Département Génie des Procédés Industriels

Sorbonne Universités - Université de Technologie de Compiègne

CS 60319, 60203 Compiègne cedex

Tel : +33(0)344234688

http://www.utc.fr/~mottelet<http://www.utc.fr/%7Emottelet>
Arvid Rosén
2018-10-15 16:17:01 UTC
Permalink
Hi again,

I just filed a bug report here:
http://bugzilla.scilab.org/show_bug.cgi?id=15809

Would it be possible to bring back the old mem-dump approach in Scilab 6? I mean, could I write a gateway that just takes a pointer to the first byte in memory, figures out the size, and dumps it to disk? Or maybe it doesn’t work like that. Writing a JSON exporter for storing filter coefficients in a math software package seems a bit ridiculous, but hey, if it works it might be worth it in our case.

Cheers,
Arvid


From: users <users-***@lists.scilab.org> on behalf of Clément DAVID <***@scilab-enterprises.com>
Reply-To: Users mailing list for Scilab <***@lists.scilab.org>
Date: Monday, 15 October 2018 at 15:48
To: Users mailing list for Scilab <***@lists.scilab.org>
Cc: Clément David <***@esi-group.com>
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello all,

Correct, I experienced such a slowness while working with Xcos diagrams for Scilab 5. At first we considered HDF5 for this deep nested list / mlist data-structure storage however after some tests ; XML might be used for tree-like storage and HDF5 (or Java types serialization) for big matrices.

AFAIK currently there is no easy way to load/save specifying a format other than HDF5 ; maybe adding xmlSave/xmlLoad sci_gateway to let the user select an xml file format for any Scilab structure might provide better performance on your use-case. JSON might also be another candidate to look at for decent serialization support.

PS: Scilab 5.5.1 load/save are direct memory dumps, so that is really the fastest you can get from Scilab; the HDF5 binary format is good enough for matrices.
--
Clément

From: users <users-***@lists.scilab.org> On Behalf Of Stéphane Mottelet
Sent: Monday, October 15, 2018 2:36 PM
To: ***@lists.scilab.org
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello,

I looked a little bit at the sources: the evident bottleneck is the nested creation of an HDF5 group each time a container variable is encountered.
The given example makes this particularly clear. If you replace the syslin structure by the corresponding [A,B;C,D] matrix, then save is ten times faster:

N = 4;
n = 1000;
filters = list();
for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());

0.724754

N = 4;
n = 1000;
filters = list();
for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = [G.a G.b;G.c G.d];
end
tic();
save('filters.dat', 'filters');
disp(toc());
--> disp(toc());

0.082302

Serializing container objects seems to be the solution, but it goes in a direction orthogonal to the HDF5 portability spirit.
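For completeness, going back from the flattened matrix to a syslin is straightforward when all systems share the same dimensions. A minimal sketch (the helper name `unflatten` and the fixed-size, single-input/single-output assumption are mine, not from the thread):

```scilab
// Sketch only: rebuild a continuous-time syslin from the flattened
// [A,B;C,D] block, assuming N states and one input/output as above.
function G = unflatten(M, N)
    A = M(1:N, 1:N);      // state matrix
    B = M(1:N, N+1);      // input matrix (single input)
    C = M(N+1, 1:N);      // output matrix (single output)
    D = M(N+1, N+1);      // feedthrough term
    G = syslin('c', A, B, C, D);
endfunction
```

With this, the fast save path stores only plain matrices and the list of syslins is reconstructed after load.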

S.


On 15/10/2018 at 12:22, Antoine Monmayrant wrote:
On 15/10/2018 at 11:55, Arvid Rosén wrote:
Hi,

Thanks for getting back to me!

Unfortunately, we used Scilab’s pretty cool way of doing object orientation, so we have big nested tlist structures with multiple instances of various lists of filters and other structures, as in my example. Saving those structures in some explicit manual way would be extremely complicated. Or is there some way of writing explicit HDF5 saving/loading schemes using overloading? That would be great! I am sure we could find the main culprits and handle them explicitly, but as they can be located anywhere in a big nested structure, it would be painful to do anything at the top level.

Another, related, problem is that the new file format uses about 15 times as much disk space as the old format (for a typical ill-behaved nested structure). That probably adds to the save/load time too, but is likely not the main source here.
Argh, yes, I tested it, and with your example I get a file 8.5× bigger.
I think that both increases in time and size are real issues and should be reported as bugs.

By the way, I rewrote your script to run it under both 6.0 and 5.5:

/////////////////////////////////
N = 4;
n = 10000;
filters = list();

for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end

ver=getversion('scilab');

if ver(1)<6 then
tic();
save('filters_old.dat', filters);
ts1 = toc();
else
tic();
save('filters_new.dat', 'filters');
ts1 = toc();
end

printf("Time for save %.2fs\n", ts1);
/////////////////////////////////

Hope it helps,

Antoine




I think I might have reported this earlier using Bugzilla, but I’m not sure. I’ll check and report it if not.

Cheers,
Arvid

From: users <users-***@lists.scilab.org> on behalf of "***@laas.fr" <***@laas.fr>
Reply-To: "***@laas.fr" <***@laas.fr>, Users mailing list for Scilab <***@lists.scilab.org>
Date: Monday, 15 October 2018 at 11:08
To: "***@lists.scilab.org" <***@lists.scilab.org>
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello,

I tried your code in 5.5.1 and the latest nightly build of 6.0: I see a slowdown of around 175× between the old save in 5.5.1 and the new (and only) save in 6.0.
It's really related to the data structure: we use HDF5 read/write a lot here and did not experience significant slowdowns with 6.0.
I think the overhead comes from translating your fairly complex variable (a long array of tlists) into the corresponding HDF5 structure.
In the old save, this translation was not necessary.
Maybe you could try to save your data in a different way.
For example:
1) you could save each element of "filters" in a separate file.
2) you could bypass save and write your data directly into an HDF5 file using h5open() and h5write(). This means you need to write your own load() for your custom file format, but this way you can try to find the best way to lay out your data in HDF5.
3) in addition to 2), you could try to save each entry of your "filters" array as one dataset in a given HDF5 file.
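For option 2), a rough, untested sketch of what writing the matrices directly might look like. The "/f<i>/A" dataset layout is an arbitrary choice of mine, and whether h5write() creates intermediate groups automatically is an assumption; a matching custom load() would have to mirror this layout:

```scilab
// Rough sketch, untested: write each filter's state-space matrices as
// plain datasets via the low-level HDF5 gateway functions.
fd = h5open("filters.h5", "w");
for i = 1:size(filters)
    G = filters(i);
    h5write(fd, msprintf("/f%d/A", i), G.a);
    h5write(fd, msprintf("/f%d/B", i), G.b);
    h5write(fd, msprintf("/f%d/C", i), G.c);
    h5write(fd, msprintf("/f%d/D", i), G.d);
end
h5close(fd);
```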

Did you search on Bugzilla whether this bug was already reported?
If not, could you report it?


Antoine

On 15/10/2018 at 10:11, Arvid Rosén wrote:
/////////////////////////////////
N = 4;
n = 10000;

filters = list();

for i=1:n
G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
filters($+1) = G;
end

tic();
save('filters.dat', filters);
ts1 = toc();

tic();
save('filters.dat', 'filters');
ts2 = toc();

printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++



Antoine Monmayrant LAAS - CNRS

7 avenue du Colonel Roche

BP 54200

31031 TOULOUSE Cedex 4

FRANCE



Tel:+33 5 61 33 64 59



email : ***@laas.fr<mailto:***@laas.fr>

permanent email : ***@polytechnique.org<mailto:***@polytechnique.org>



+++++++++++++++++++++++++++++++++++++++++++++++++++++++
a***@laas.fr
2018-10-16 07:52:11 UTC
Permalink
Post by Arvid Rosén
Hi again,
http://bugzilla.scilab.org/show_bug.cgi?id=15809
Would it be possible to bring back the old mem-dump approach in scilab 6?
Couldn't you create your own ATOMS package that restores this raw memory
dump for Scilab 6.0?
I understand why we moved away from this model, but it seems to be key
for you.
There is always a trade-off between portability (and robustness) and raw
speed...
Post by Arvid Rosén
I mean, could I write a gateway that just takes a pointer to the first
byte in memory, figures out the size, and dumps to disk? Or maybe it
doesn’t work like that. Writing a JSON exporter for storing filter
coefficients in a math software package seems a bit ridicules, but
hey, if it works it might be worth it in our case.
I was also wondering whether this can be done in HDF5, i.e. serialize
your structure and dump it into HDF5.
We use HDF5 with LabVIEW, and for some horrible structures (like arrays of
clusters containing lots of elements of different types), we just turn
them into a byte stream and dump the stream into an HDF5 dataset.
We then retrieve it and rebuild the structure (knowing its shape).
Could this be implemented in Scilab 6?
What would be missing: the any-variable-to-bytestream conversion and the
way back?

Antoine
Arvid Rosén
2018-10-16 11:01:27 UTC
Permalink
From: users <users-***@lists.scilab.org> on behalf of "***@laas.fr" <***@laas.fr>
Reply-To: "***@laas.fr" <***@laas.fr>, Users mailing list for Scilab <***@lists.scilab.org>
Date: Tuesday, 16 October 2018 at 09:53
To: "***@lists.scilab.org" <***@lists.scilab.org>
Subject: Re: [Scilab-users] HDF5 save is super slow

Couldn't you create your own atom package that restore this raw memory dump for scilab 6.0?
I understand why we moved away from this model, but it seems to be key for you.
There is always a trade-off between portability (and robustness) and raw speed...

Yeah, if that were possible, I would certainly do it. We already have a bunch of C/C++ binaries that we compile and link dynamically, but for that to be easy to implement, I guess the lists and structures would need to be stored linearly in one consecutive chunk of memory. I don’t know if that is the case. Anyone? C++ integrations and gateways are very poorly documented at the moment.
Otherwise, I would need to do some recursive implementation that handles a bunch of different object types. Sounds painful.

Cheers,
Arvid
Clément DAVID
2018-10-18 12:09:38 UTC
Permalink
Hello,

My 2 cents: this is probably a poor man’s approach, but Xcos offers vec2var / var2vec functions that encode any Scilab datatype passed as argument into a double vector. The encoding duplicates the data in memory, so there might be some overhead.

On my machine, I have these timings using the attached script (Antoine’s one edited):
save list of syslins: 1.361704
save list of vec[]: 0.056788
save var2vec(list of syslins): 0.014411

Discarding HDF5 group creation is a huge performance win, but it removes any way to create clean HDF5 (e.g. to address subgroups directly).
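Assuming var2vec/vec2var round-trip the list faithfully (struct and cell are reportedly not supported), the workaround sketched above would look something like this:

```scilab
// Sketch: serialize the whole list into one double vector before saving,
// so HDF5 only sees a single flat dataset; rebuild the list on load.
v = var2vec(filters);
save("filters.dat", "v");
// ... later, possibly in another session ...
load("filters.dat", "v");
filters = vec2var(v);
```

Note that this trades away the browsable HDF5 layout: the file then contains an opaque vector only vec2var can interpret.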

Thanks,
--
Clément

From: users <users-***@lists.scilab.org> On Behalf Of Arvid Rosén
Sent: Tuesday, October 16, 2018 1:01 PM
To: ***@laas.fr; Users mailing list for Scilab <***@lists.scilab.org>
Subject: Re: [Scilab-users] HDF5 save is super slow

From: users <users-***@lists.scilab.org> on behalf of "***@laas.fr" <***@laas.fr>
Reply-To: "***@laas.fr" <***@laas.fr>, Users mailing list for Scilab <***@lists.scilab.org>
Date: Tuesday, 16 October 2018 at 09:53
To: "***@lists.scilab.org" <***@lists.scilab.org>
Subject: Re: [Scilab-users] HDF5 save is super slow

Couldn't you create your own atom package that restore this raw memory dump for scilab 6.0?
I understand why we moved away from this model, but it seems to be key for you.
There is always a trade-off between portability (and robustness) and raw speed...

Yeah, if that was possible, I would certainly do it. We already have a bunch of C/C++ binaries that we compile and link dynamically, but for that to be easy to implement, I guess the lists and structures need to be stored linearly in one consecutive chunk of memory. I don’t know if that is the case. Anyone? C++ integrations and gateways are very poorly documented at the moment.
Otherwise, I would need to do some recursive implementation, that handles a bunch of different object types. Sounds painful.

Cheers,
Arvid
Stéphane Mottelet
2018-10-18 12:39:04 UTC
Permalink
Hello Clément,
Post by a***@laas.fr
Hello,
My 2cents, this is probably a poor man’s approach but Xcos offers
vec2var / var2vec functions that encode in a double vector any Scilab
datatypes passed as arguments. The encoding duplicates the data in
memory so there might be some overhead.
Do you think it would be complicated to write the serialized data to
disk on the fly?
Post by a***@laas.fr
On my machine, I have these timings using the attached script
save _list_ of _syslins_: 1.361704
save _list_ of vec[]: 0.056788
save var2vec(list of _syslins_): 0.014411
Discarding hdf5 groups creation is a huge performance win but remove
_any way_ to create clean hdf5 (eg. to address subgroups directly).
Thanks,
--
Clément
Clément DAVID
2018-10-18 13:15:36 UTC
Permalink
Hello Stephane,

TL;DR: HDF5 is a cross-platform, cross-language, portable file format used in almost all scientific software these days. Please use this sane default!

Writing a custom serialization scheme (like the one provided by vec2var / var2vec) might not be complicated to implement; the hard part is maintaining and documenting a serialization format to be used in the long term.

Using Scilab 5, the “stack” save and load functions were almost trivial, as they map memory directly to disk; the format used is “the stack”, so it is known and used everywhere (even for custom string encoding). The vec2var serialization is only used internally (to pass block parameters around), does not follow any documented format, is not validated against any specification, and is not portable; in the long term, I won’t promise it to be stable. Implementing your own serialization scheme will probably lead your software into trouble. Really, it isn’t easy in the long term! The HDF5 format is documented, its serialized data are browsable (through HDFView), and it does not impose low-level requirements.

To me, the issue is really a performance bug. We might find a way to fix it within Scilab rather than provide a workaround (with custom encodings). The hdf5 library is a big one; with a clever understanding of its internal serialization, we might find a better execution path for this use case (without changing the file format).

Thanks,
--
Clément

antoine monmayrant
2018-10-18 12:46:57 UTC
Permalink
Post by a***@laas.fr
Hello,
My 2cents, this is probably a poor man’s approach but Xcos offers vec2var / var2vec functions that encode in a double vector any Scilab datatypes passed as arguments. The encoding duplicates the data in memory so there might be some overhead.
Er, I tried var2vec, but it does not work with structures:
--> typeof(t)
 ans  =
 st

--> var2vec(t)
var2vec: Wrong type for input argument #1: Double, Integer, Boolean,
String or List type.

Arghh... so var2vec does not work for every datatype, right?

Antoine
Clément David
2018-10-18 12:56:25 UTC
Permalink
Hi Antoine,

That's a fair point: vec2var was defined to pass some datatypes from Scilab "ast" (C++ side, data pointers, refcounted) to Scilab "scicos" (C, raw memory allocated once and passed around). Some data structures might not be handled correctly; I was even surprised that mlists worked correctly.

Scilab structs (and cells) are missing, as they are more complex datatypes to serialize. Handles are even harder (as you need to list the properties somewhere). Feel free to take a look at the code [1],

[1]: http://cgit.scilab.org/scilab/tree/scilab/modules/scicos/src/cpp/var2vec.cpp?h=6.0#n243

Cheers,
--
Clément

Stéphane Mottelet
2018-10-18 13:00:56 UTC
Permalink
Hello again,
Post by Clément David
Hi Antoine,
That one point, vec2var has been defined to pass some datatypes from Scilab "ast" (C++ side, data pointers, refcounted) to Scilab "scicos" (C, raw memory allocated once and passed around). Some data structures might not be handled correctly, I was even surprised that mlists worked correctly.
Scilab Struct (or Cell) are missing as they are more complex datatypes to serialize. Handle are even harder (as you need to list the properties somewhere). Feel free to take a look at the code [1],
[1]: http://cgit.scilab.org/scilab/tree/scilab/modules/scicos/src/cpp/var2vec.cpp?h=6.0#n243
Why is the code for structs (lines 242--74) commented out? Is it
broken, or is there another reason?
a***@laas.fr
2018-10-18 13:07:44 UTC
Permalink
Post by Stéphane Mottelet
Hello again,
Post by Clément David
Hi Antoine,
On that point, vec2var was defined to pass some datatypes from
Scilab "ast" (C++ side, data pointers, refcounted) to Scilab "scicos"
(C, raw memory allocated once and passed around). Some data
structures might not be handled correctly; I was even surprised that
mlists worked correctly.
Scilab Structs (or Cells) are missing, as they are more complex
datatypes to serialize. Handles are even harder (as you need to list
the properties somewhere). Feel free to take a look at the code [1],
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/cgit.scilab.org/scilab/tree/scilab/modules/scicos/src/cpp/var2vec.cpp?h=6.0#n243
Why is the code for structs (lines 242--74) commented out? Is it
broken, or something else?
If var2vec() / vec2var() could be extended to provide a universal way to
serialize / deserialize really any Scilab variable, that would be really
nice.
Could we write a SEP or file a bug as a wish?

Antoine
Post by Stéphane Mottelet
Post by Clément David
Cheers,
--
Clément
-----Original Message-----
Sent: Thursday, October 18, 2018 2:47 PM
Subject: Re: [Scilab-users] HDF5 save is super slow
Post by a***@laas.fr
Hello,
My 2cents, this is probably a poor man’s approach but Xcos offers
vec2var / var2vec functions that encode in a double vector any
Scilab datatypes passed as arguments. The encoding duplicates the
data in memory so there might be some overhead.
--> typeof(t)
   ans  =
   st
--> var2vec(t)
var2vec: Wrong type for input argument #1: Double, Integer, Boolean, String or List type.
Arghh... so var2vec does not work for any datatype right?
Antoine
Post by a***@laas.fr
save list of syslins: 1.361704
save list of vec[]: 0.056788
save var2vec(list of syslins): 0.014411
Discarding hdf5 groups creation is a huge performance win but remove
any way to create clean hdf5 (eg. to address subgroups directly).
Thanks,
--
Clément
Sent: Tuesday, October 16, 2018 1:01 PM
Subject: Re: [Scilab-users] HDF5 save is super slow
From: users
mailing list for Scilab
Date: Tuesday, 16 October 2018 at 09:53
Subject: Re: [Scilab-users] HDF5 save is super slow
Couldn't you create your own atom package that restore this raw
memory dump for scilab 6.0?
I understand why we moved away from this model, but it seems to be key for you.
There is always a trade-off between portability (and robustness) and raw speed...
Yeah, if that was possible, I would certainly do it. We already have
a bunch of C/C++ binaries that we compile and link dynamically, but
for that to be easy to implement, I guess the lists and structures
need to be stored linearly in one consecutive chunk of memory. I
don’t know if that is the case. Anyone? C++ integrations and
gateways are very poorly documented at the moment.
Otherwise, I would need to do some recursive implementation, that
handles a bunch of different object types. Sounds painful.
Cheers,
Arvid
_______________________________________________
users mailing list
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists.scilab.org/mailman/listinfo/users
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE

Tel:+33 5 61 33 64 59

email : ***@laas.fr
permanent email : ***@polytechnique.org

+++++++++++++++++++++++++++++++++++++++++++++++++++++++
Clément DAVID
2018-10-18 14:06:11 UTC
Permalink
Post by Stéphane Mottelet
Why is the code for structs (lines 242--74)  commented out ? Is it
broken or else ?
If var2vec() / vec2var() could be extended to provide a universal way to serialize / deserialize really any scilab variable, that would be really nice.
Could we make a SEP or fill a bug as a wish ?
Yep, I have no problem allocating a SEP for that; vec2var / var2vec might be used elsewhere, but the format needs to be documented for all managed types. Handles, for example, will probably not be stored at first. A SEP might even be good to define the real need, "a Scilab variable dump for cheap and fast access", and to clearly state that this encoding might change across OS, hardware, or Scilab versions.

Are you interested in writing it ?

Thanks,
--
Clément
Clément DAVID
2018-10-18 13:51:49 UTC
Permalink
Hello Stephane,

Probably commented out because we have no easy way to extract such data using only C constructs (from a Scicos block). It might be possible to uncomment it and check the counterpart side (vec2var.cpp) to ensure it works correctly.

Thanks,
--
Clément

-----Original Message-----
From: users <users-***@lists.scilab.org> On Behalf Of Stéphane Mottelet
Sent: Thursday, October 18, 2018 3:01 PM
To: ***@lists.scilab.org
Subject: Re: [Scilab-users] HDF5 save is super slow

Hello again,
Post by Clément David
Hi Antoine,
That one point, vec2var has been defined to pass some datatypes from Scilab "ast" (C++ side, data pointers, refcounted) to Scilab "scicos" (C, raw memory allocated once and passed around). Some data structures might not be handled correctly, I was even surprised that mlists worked correctly.
Scilab Struct (or Cell) are missing as they are more complex datatypes
to serialize. Handle are even harder (as you need to list the
properties somewhere). Feel free to take a look at the code [1],
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/cgit.
scilab.org/scilab/tree/scilab/modules/scicos/src/cpp/var2vec.cpp?h=6.0
#n243
Why is the code for structs (lines 242--74)  commented out ? Is it broken or else ?
Post by Clément David
Cheers,
--
Clément
-----Original Message-----
Sent: Thursday, October 18, 2018 2:47 PM
Subject: Re: [Scilab-users] HDF5 save is super slow
Post by a***@laas.fr
Hello,
My 2cents, this is probably a poor man’s approach but Xcos offers vec2var / var2vec functions that encode in a double vector any Scilab datatypes passed as arguments. The encoding duplicates the data in memory so there might be some overhead.
--> typeof(t)
 ans  =
 st
--> var2vec(t)
var2vec: Wrong type for input argument #1: Double, Integer, Boolean, String or List type.
Arghh... so var2vec does not work for any datatype right?
Antoine
Post by a***@laas.fr
save list of syslins: 1.361704
save list of vec[]: 0.056788
save var2vec(list of syslins): 0.014411
Discarding hdf5 groups creation is a huge performance win but remove any way to create clean hdf5 (eg. to address subgroups directly).
Thanks,
--
Clément
Sent: Tuesday, October 16, 2018 1:01 PM
Subject: Re: [Scilab-users] HDF5 save is super slow
From: users
Users mailing list for Scilab
Date: Tuesday, 16 October 2018 at 09:53
Subject: Re: [Scilab-users] HDF5 save is super slow
Couldn't you create your own atom package that restore this raw memory dump for scilab 6.0?
I understand why we moved away from this model, but it seems to be key for you.
There is always a trade-off between portability (and robustness) and raw speed...
Yeah, if that was possible, I would certainly do it. We already have a bunch of C/C++ binaries that we compile and link dynamically, but for that to be easy to implement, I guess the lists and structures need to be stored linearly in one consecutive chunk of memory. I don’t know if that is the case. Anyone? C++ integrations and gateways are very poorly documented at the moment.
Otherwise, I would need to do some recursive implementation, that handles a bunch of different object types. Sounds painful.
Cheers,
Arvid
_______________________________________________
users mailing list
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists
.scilab.org/mailman/listinfo/users
--
Stéphane Mottelet
Ingénieur de recherche
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet

_______________________________________________
users mailing list
***@lists.scilab.org
http://lists.scilab.org/mailman/listinfo/users
Stéphane Mottelet
2018-10-18 13:56:11 UTC
Permalink
Post by Clément DAVID
Hello Stephane,
Probably commented out as we have no easy way to extract such data easily using only C constructs (from a Scicos block). It might be possible to uncomment and check the counterpart side (vec2var.cpp) to ensure it works correctly.
Likely because only single structs are supported.

S.
Post by Clément DAVID
Thanks,
--
Clément
-----Original Message-----
Sent: Thursday, October 18, 2018 3:01 PM
Subject: Re: [Scilab-users] HDF5 save is super slow
Hello again,
Post by Clément David
Hi Antoine,
That one point, vec2var has been defined to pass some datatypes from Scilab "ast" (C++ side, data pointers, refcounted) to Scilab "scicos" (C, raw memory allocated once and passed around). Some data structures might not be handled correctly, I was even surprised that mlists worked correctly.
Scilab Struct (or Cell) are missing as they are more complex datatypes
to serialize. Handle are even harder (as you need to list the
properties somewhere). Feel free to take a look at the code [1],
https://antispam.utc.fr/proxy/2/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/cgit.
scilab.org/scilab/tree/scilab/modules/scicos/src/cpp/var2vec.cpp?h=6.0
#n243
Why is the code for structs (lines 242--74)  commented out ? Is it broken or else ?
Post by Clément David
Cheers,
--
Clément
-----Original Message-----
Sent: Thursday, October 18, 2018 2:47 PM
Subject: Re: [Scilab-users] HDF5 save is super slow
Post by a***@laas.fr
Hello,
My 2cents, this is probably a poor man’s approach but Xcos offers vec2var / var2vec functions that encode in a double vector any Scilab datatypes passed as arguments. The encoding duplicates the data in memory so there might be some overhead.
--> typeof(t)
 ans  =
 st
--> var2vec(t)
var2vec: Wrong type for input argument #1: Double, Integer, Boolean, String or List type.
Arghh... so var2vec does not work for any datatype right?
Antoine
Post by a***@laas.fr
save list of syslins: 1.361704
save list of vec[]: 0.056788
save var2vec(list of syslins): 0.014411
Discarding hdf5 groups creation is a huge performance win but remove any way to create clean hdf5 (eg. to address subgroups directly).
Thanks,
--
Clément
Sent: Tuesday, October 16, 2018 1:01 PM
Subject: Re: [Scilab-users] HDF5 save is super slow
From: users
Users mailing list for Scilab
Date: Tuesday, 16 October 2018 at 09:53
Subject: Re: [Scilab-users] HDF5 save is super slow
Couldn't you create your own atom package that restore this raw memory dump for scilab 6.0?
I understand why we moved away from this model, but it seems to be key for you.
There is always a trade-off between portability (and robustness) and raw speed...
Yeah, if that was possible, I would certainly do it. We already have a bunch of C/C++ binaries that we compile and link dynamically, but for that to be easy to implement, I guess the lists and structures need to be stored linearly in one consecutive chunk of memory. I don’t know if that is the case. Anyone? C++ integrations and gateways are very poorly documented at the moment.
Otherwise, I would need to do some recursive implementation, that handles a bunch of different object types. Sounds painful.
Cheers,
Arvid
_______________________________________________
users mailing list
https://antispam.utc.fr/proxy/2/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists
.scilab.org/mailman/listinfo/users
--
Stéphane Mottelet
Ingénieur de recherche
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/www.utc.fr/~mottelet
_______________________________________________
users mailing list
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists.scilab.org/mailman/listinfo/users
_______________________________________________
users mailing list
https://antispam.utc.fr/proxy/1/c3RlcGhhbmUubW90dGVsZXRAdXRjLmZy/lists.scilab.org/mailman/listinfo/users
--
Stéphane Mottelet
Ingénieur de recherche
EA 4297 Transformations Intégrées de la Matière Renouvelable
Département Génie des Procédés Industriels
Sorbonne Universités - Université de Technologie de Compiègne
CS 60319, 60203 Compiègne cedex
Tel : +33(0)344234688
http://www.utc.fr/~mottelet
a***@laas.fr
2018-10-15 09:08:04 UTC
Permalink
Hello Arvid,

On m
Post by Arvid Rosén
/////////////////////////////////
N = 4;
n = 10000;
filters = list();
for i=1:n
  G=syslin('c', rand(N,N), rand(N,1), rand(1,N), rand(1,1));
  filters($+1) = G;
end
tic();
save('filters.dat', filters);
ts1 = toc();
tic();
save('filters.dat', 'filters');
ts2 = toc();
printf("old save %.2fs\n", ts1);
printf("new save %.2fs\n", ts2);
printf("slowdown %.1f\n", ts2/ts1);
/////////////////////////////////
--
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

Antoine Monmayrant LAAS - CNRS
7 avenue du Colonel Roche
BP 54200
31031 TOULOUSE Cedex 4
FRANCE

Tel:+33 5 61 33 64 59

email : ***@laas.fr
permanent email : ***@polytechnique.org

+++++++++++++++++++++++++++++++++++++++++++++++++++++++