1 set i /i1*i300/;
2
3 alias (i,j,k);
4
5 parameter a(i,j,k) /
6 i1.i1.i1 1
7 /;
8
9 parameter b(i);
10 b(i) = 0.5;
11
12 * slow division
13 parameter c0(i,j,k);
14 c0(i,j,k) = a(i,j,k)/b(i);
15
16 * fast division (method 1)
17 parameter c1(i,j,k);
18 c1(i,j,k)$a(i,j,k) = a(i,j,k)/b(i);
19
20 * fast division (method 2)
21 parameter b2(i);
22 b2(i) = 1/b(i);
23 parameter c2(i,j,k);
24 c2(i,j,k) = a(i,j,k)*b2(i);
The profile information shows:
---- 1 ExecInit 0.000 0.000 SECS 3 Mb
---- 10 Assignment b 0.000 0.000 SECS 4 Mb 300
---- 14 Assignment c0 3.292 3.292 SECS 4 Mb 1
---- 18 Assignment c1 0.000 3.292 SECS 4 Mb 1
---- 22 Assignment b2 0.000 3.292 SECS 4 Mb 300
---- 24 Assignment c2 0.000 3.292 SECS 4 Mb 1
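That is, the original assignment c0 is slow (3.292 seconds), while assignments c1 and c2 take no measurable time. The second time column is cumulative, which is why it stays at 3.292 for the later statements. The assignment to c0 runs over all 300 x 300 x 300 = 27 million tuples (i,j,k). In assignment c1 the dollar condition restricts the assignment to the single nonzero element of a(i,j,k). In assignment c2 we multiply by the reciprocal instead. Multiplication is fast because it skips any zero a(i,j,k).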
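A quick sanity check is to confirm that the three methods produce the same result. The fragment below is only a sketch and is not part of the original listing: it assumes it is appended after the statements above, and the scalar maxdiff and the tolerance 1e-9 are names and values introduced here for illustration.

```gams
* Hypothetical check (not in the original listing): compare c0, c1 and c2.
* The smax is restricted to the nonzero entries of a, so it is cheap.
* Where a(i,j,k) is zero, all three results are zero as well.
scalar maxdiff 'largest absolute difference between the results';
maxdiff = smax((i,j,k)$a(i,j,k),
               max(abs(c0(i,j,k)-c1(i,j,k)), abs(c0(i,j,k)-c2(i,j,k))));
display maxdiff;
abort$(maxdiff > 1e-9) "c0, c1 and c2 differ";
```

Note that method 2 evaluates 1/b(i) for every i, so it relies on all b(i) being nonzero, as is the case here with b(i) = 0.5.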