2018-08-26

CAF 先着４image 様まで

CoArray Fortranで先着幾名かまで実行

atomic_ref と atomic_define で何とかなるかとも思いましたが、atomic な読み出しと書き込みの間隙をぬって、他の image が入り込む可能性があるので、critical .... end critical 構文でやる必要がある気がします。

Fortran2018 では atomic_fetch_add 関数が導入されるので、これが使えるようになれば楽になろうかともいます。

critical 構文と atomic サブルーチン

critical .... end critical 構文と atomic サブルーチンはどちらも排他的実行に関わるものです。これらの違いはというと、critical 構文は critical と end critical で挟まれた一連の命令文の排他的実行であり、atomic サブルーチンは coarray 変数に対する排他的操作になっています。

つまり critical の方が文の排他実行で抽象度が高いのに対して、atomic サブルーチンは単一の coarray 変数の値に対する低レベルな排他的な読み書きになっています。

Fortran 2008 の段階では、本当に読み出しないし書き込みしか出来ないため、利用しづらいものがあります。そのためか、Fortran2018 では、以下のように仲間が増えて拡充されます。

call atomic add ( atom, value[ , stat] )

call atomic and ( atom, value[ , stat] )

call atomic or ( atom, value[ , stat] )

call atomic xor ( atom, value[ , stat] )

call atomic fetch add ( atom, value, old[ , stat] )

call atomic fetch and ( atom, value, old[ , stat] )

call atomic fetch or ( atom, value, old[ , stat] )

call atomic fetch xor ( atom, value, old[ , stat] )

call atomic cas ( atom, old, compare, new[ , stat] )

N2161 The New Features of Fortran 2018 (Reid - Replaces N2145) §3.20 New and enhanced atomic subroutines

Modern Fortran Explained: Incorporating Fortran 2018 (Numerical Mathematics and Scientific Computation)

作者: Michael Metcalf,John Reid,Malcolm Cohen
出版社/メーカー: Oxford Univ Pr
発売日: 2018/11/06
メディア: ハードカバー
この商品を含むブログを見る

プログラム

nr[1] を特別扱いして、カウンタとします。変数 k には先着順が入ります。

    program caf04
        implicit none
        integer :: ne, me, k, nr[*]
        real :: x
        ne = num_images()
        me = this_image()
        if (me == 1) nr = 1
        sync all
        x = sqrt(real( me ))
        
        critical
            k = nr[1]  
            nr[1] = nr[1] + 1  
        end critical     
      
        if (k <= 4) print *, k, ':', me, '/', ne, x
    end program caf04

将来：

    program caf04
        use, intrinsic :: iso_fortran_env
        implicit none
        integer(atomic_int_kind) :: k, nr[*]
        integer :: ne, me
        real :: x
        ne = num_images()
        me = this_image()
        if (me == 1) nr = 1
        sync all
        x = sqrt(real( me ))
        
       call atomic_fetch_add(nr[1], 1, k)  ! 多分
      
        if (k <= 4) print *, k, ':', me, '/', ne, x
    end program caf04

実行結果

           1 :           1 /           8   1.000000
           2 :           5 /           8   2.236068
           4 :           7 /           8   2.645751
           3 :           6 /           8   2.449490
続行するには何かキーを押してください . . .

Parallel Programming with Co-arrays (Chapman & Hall/CRC Computational Science)

作者: Robert W. Numrich
出版社/メーカー: Chapman and Hall/CRC
発売日: 2018/09/24
メディア: ハードカバー
この商品を含むブログを見る

2018-08-25

【メモ帳】イギリス　スパコンセンター　講習資料ページ

Archer の講義・講演ページ

シンボルマークが弓兵・射手というより的というかんじで、アメリカのスーパーのターゲットのようです。

www.archer.ac.uk

2018-08-23

【メモ帳】CAF 　縁交換

CAF (CoArray Fortran) で Halo Exchange

CAF (CoArray Fortran) で境界のヘリの交換（Halo Exchange）をやってみました。もしくは後光の交換。

Halo のイメージ　なんとなくビザンチンのモザイク画
f:id:fortran66:20180823110712j:plain

Parallel Programming with Co-arrays (Chapman & Hall/CRC Computational Science)

作者: Robert W. Numrich
出版社/メーカー: Chapman and Hall/CRC
発売日: 2018/09/24
メディア: ハードカバー
この商品を含むブログを見る

10世紀末頃ビザンチンで編纂された百科全書的辞書。『スーダ』 Suda
Stoa | Welcome to the Suda On Line (SOL)

ソース・プログラム１

Intel Fortran Ver.18
コア数９個に設定 3*3 の coshape とした。上下方向には周期境界条件を、左右方向は普通の境界を設定した。各イメージ固有の配列サイズは 2*2 として、その周囲を囲むように幅 1 の縁を置いた。

以下のプログラムでは、各列ないし各行毎にシリアルな順序つき実行になっている。

    program CAF001
        implicit none
        real, allocatable :: a(:, :)[:, :]
        integer :: me, ne, nx = 2, ny = 2
        integer :: ix, iy, ipos(2)
        
        ne = num_images()
        me = this_image()
 
        allocate(a(0:nx + 1, 0:ny + 1)[3, *])
        ix = this_image(a, 1)
        iy = this_image(a, 2)
        !
        ! initial value
        a = me
        a(:, 0     ) =  0.0
        a(:, ny + 1) = 50.0
        !
        ! Halo exchange
        ! up <--> down 
        sync all
        if (ix < ucobound(a, 1)) then
            sync images ( image_index(a, [ix + 1, iy]) )
            a(0     , :)[ix + 1, iy] = a(ny, :)
        end if
        if (lcobound(a, 1) < ix) then
            a(ny + 1, :)[ix - 1, iy] = a( 1, :)
            sync images ( image_index(a, [ix - 1, iy]) )
        end if
        ! periodic boundary condition
        if (ix == ucobound(a, 1)) then 
            sync images ( image_index(a, [lcobound(a, 1), iy]) ) 
            a(0     , :)[lcobound(a, 1), iy] = a(ny, :)
        end if    
        if (ix == lcobound(a, 1)) then 
            a(ny + 1, :)[ucobound(a, 1), iy] = a( 1, :)
            sync images ( image_index(a, [ucobound(a, 1), iy]) ) 
        end if    
!
        ! left <--> right
        sync all
        if (iy < ucobound(a, 2)) then
            sync images ( image_index(a, [ix, iy + 1]) )
            a(:,      0)[ix, iy + 1] = a(:, ny)
        end if
        if (lcobound(a, 2) < iy) then
            a(:, ny + 1)[ix, iy - 1] = a(:,  1)
            sync images ( image_index(a, [ix, iy - 1]) )
        end if
        
        ! ordered output 1..ne
        if ( 1 < me) sync images (me - 1)
        print *, 'image=', me
        print '(4f5.1)', transpose(a)
        if (me < ne) sync images (me + 1)
    end program CAF001

個々の配列要素の重なりさえ避けられれば、基本的に sync はあまり必要ない。しかし要所要所で sync all することになるので、同期がモッサリする。

Modern Fortran Explained: Incorporating Fortran 2018 (Numerical Mathematics and Scientific Computation)

作者: Michael Metcalf,John Reid,Malcolm Cohen
出版社/メーカー: Oxford Univ Pr
発売日: 2018/11/06
メディア: ハードカバー
この商品を含むブログを見る

ソース・プログラム２

細かい sync をせず、区切りごとに sync all した場合。

どうするのが一番いいのか、よく分からない。coarray 代入には PUT 形式の方が GET 形式より、一般に性能が出やすいらしいが･･･　(PUT形式: a[i] = b, GET形式: a = b[j])

    program CAF001
        implicit none
        real, allocatable :: a(:, :)[:, :]
        integer :: me, ne, nx = 2, ny = 2
        integer :: ix, iy, ix1, ix2, iy1, iy2
        
        ne = num_images()
        me = this_image()
        print *, me, '/', ne
        
        allocate( a(0:nx + 1, 0:ny + 1)[3, *] )
        !
        ix = this_image(a, 1) 
        iy = this_image(a, 2)
        !
        ! periodic configuration
        !
        if (ix == lcobound(a, 1)) then 
            ix1 = ucobound(a, 1)
        else 
            ix1 = ix - 1
        end if    
        if (ix == ucobound(a, 1)) then 
            ix2 = lcobound(a, 1)
        else 
            ix2 = ix + 1
        end if    
        if (iy == lcobound(a, 2)) then 
            iy1 = ucobound(a, 2)
        else 
            iy1 = iy - 1
        end if    
        if (iy == ucobound(a, 2)) then 
            iy2 = lcobound(a, 2)
        else 
            iy2 = iy + 1
        end if    
        !
        ! initial value
        !
        a = me
        a(:, 0     ) =  0.0
        a(:, ny + 1) = 50.0
        !
        ! Halo exchange
        !
        ! up <--> down :  periodic boundary condition
        sync all
        a(nx + 1, :)[ix1, iy] = a( 1, :)
        a(0     , :)[ix2, iy] = a(nx, :)
        !
        ! left <--> right
        sync all
        if ( iy > lcobound(a, 2) ) a(:, ny + 1)[ix, iy1] = a(:,  1)
        if ( iy < ucobound(a, 2) ) a(:,      0)[ix, iy2] = a(:, ny)
        !
        ! ordered output 1..ne
        !
        sync all
        if ( 1 < me) sync images (me - 1)
        print *, 'image=', me
        print '(4f5.1)', transpose(a)
        if (me < ne) sync images (me + 1)
    end program CAF001

実行結果

イメージ番号配置（処理系により勝手に設定される？）
1 4 7
2 5 8
3 6 9

上下方向には周期境界条件。左右方向は普通の境界。

 4 4 : 3 3
 4 4 : 3 3
 4 4 : 3 3
 4 4 : 3 3
 4 4 : 3 3
 4 4 : 3 3
 4 4 : 3 3
 4 4 : 3 3
 4 4 : 3 3
 image= 1
  0.0  3.0  3.0  6.0
  0.0  1.0  1.0  4.0
  0.0  1.0  1.0  4.0
  0.0  2.0  2.0  5.0
 image= 2
  0.0  1.0  1.0  4.0
  0.0  2.0  2.0  5.0
  0.0  2.0  2.0  5.0
  0.0  3.0  3.0  6.0
 image= 3
  0.0  2.0  2.0  5.0
  0.0  3.0  3.0  6.0
  0.0  3.0  3.0  6.0
  0.0  1.0  1.0  4.0
 image= 4
  3.0  6.0  6.0  9.0
  1.0  4.0  4.0  7.0
  1.0  4.0  4.0  7.0
  2.0  5.0  5.0  8.0
 image= 5
  1.0  4.0  4.0  7.0
  2.0  5.0  5.0  8.0
  2.0  5.0  5.0  8.0
  3.0  6.0  6.0  9.0
 image= 6
  2.0  5.0  5.0  8.0
  3.0  6.0  6.0  9.0
  3.0  6.0  6.0  9.0
  1.0  4.0  4.0  7.0
 image= 7
  6.0  9.0  9.0 50.0
  4.0  7.0  7.0 50.0
  4.0  7.0  7.0 50.0
  5.0  8.0  8.0 50.0
 image= 8
  4.0  7.0  7.0 50.0
  5.0  8.0  8.0 50.0
  5.0  8.0  8.0 50.0
  6.0  9.0  9.0 50.0
 image= 9
  5.0  8.0  8.0 50.0
  6.0  9.0  9.0 50.0
  6.0  9.0  9.0 50.0
  4.0  7.0  7.0 50.0

2018-08-21

【メモ帳】遠雷

英 NAG Fortran 2008 まとめ

簡潔にまとまっています。

Fortran 2008 Overview

Fortran 2008 Overview www.nag.co.uk

特に coarray 部分
NAG Fortran Compiler, Release 6.2: SPMD programming with coarrays [6.2] www.nag.co.uk

Steve Lionel 氏の twitter より二題

John Reid 氏「The New Features of Fortran 2018」最新版

N2161 The New Features of Fortran 2018 (Reid - Replaces N2145)
wg5-fortran.org

Modern Fortran Explained: Incorporating Fortran 2018 (Numerical Mathematics and Scientific Computation)

作者: Michael Metcalf,John Reid,Malcolm Cohen
出版社/メーカー: Oxford Univ Pr
発売日: 2018/11/06
メディア: ハードカバー
この商品を含むブログを見る

Damian Rouson 氏の「Writing Fortran 2018 Today: Object-Oriented Parallel Programming」

www.eventbrite.com

and other open-source tools that will be employed in the course:

Distributed source mangement with git and GitHub.

Cross-platform building with CMake.

Unit testing with CTest.

Automatic documentation generation with FORD.

Reverse engineering of OOD diagrams with ForUML.

Parallel performance engineering with TAU.

数年前から Fortran 界にも、遠くの雷鳴の如く、Unit Test や Document 自動生成の話題が出ていましたが、いよいよ近づいてきたような気がします。

Scientific Software Design: The Object-Oriented Way

作者: Damian Rouson,Jim Xia,Xiaofeng Xu
出版社/メーカー: Cambridge University Press
発売日: 2011/04/29
メディア: Kindle版
この商品を含むブログを見る

各々三行くらいに噛み砕いてお話してくれる紙芝居（Power Point）屋のおじちゃんが現れないかな。

「CAFe: Coarray Fortran Extensions for Heterogeneous Computing」

CAFe: Coarray Fortran Extensions for Heterogeneous Computing - IEEE Conference Publication

CAF は各プロセッサが等質な PGAS なので、Heterogeneous な計算機には向かないと思って、前には実にしょーもないと思っていたけれど･･･　次期規格での team が導入されると、各々の計算装置を束ねて、計算は各々 team として一括で扱え、律速のメモリー移動を明示的に人間が指示できる Coarray Fortran との組み合わせは意外にいいアイデアなのではないかという気がしてきたｗ

CoArray の基本思想 by R. Numrich (2001)

昔のスライドですが・・・
f:id:fortran66:20180821233115p:plain
http://charm.cs.uiuc.edu/kale/teaching/cs320/CAFfromupcsc01CarlsonB.pdf

The Co-Array Fortran Philosophy

What is the smallest change required to make

Fortran 90 an effective parallel language?

How can this change be expressed so that it is

intuitive and natural for Fortran programmers to
understand?

How can it be expressed so that existing compiler

technology can implement it efficiently?

並列化へ文法の最小限度の変更、Fortran 利用者の負担も最小限度、効率よい最適化（コンパイラの負担も最小限度）

John Mellor-Crummey 氏「From HPF to Coarray Fortran 2.0」(2013)

http://labexcompilation.ens-lyon.fr/wp-content/uploads/2013/02/John.pdf

f:id:fortran66:20180822011926p:plain

2018-08-18

【メモ帳】

Fortran の特徴

The Fortran standards put emphasis on: Performance: many opportunities for optimisation Note that this puts some burden on the programmer too!

Type-safety: sometimes the standard is stricter than you find in other languages　　

f:id:fortran66:20180818175552p:plain

tcevents.chem.uzh.ch

【ニュース】Arjen Markus 氏の modern Fortran 講義その他 - fortran66のブログ

Arjen Boin

f:id:fortran66:20180818183626j:plain
【衝撃】ありえんボイン、現るｗｗｗｗｗ: GOSSIP速報

ARMいわく「次世代プロセッサーでインテルを追い抜く」。2020年にはノートPCクラスの処理速度を獲得へ

ほんまかいな

japanese.engadget.com

歴史的には、安くて数の多い「と金」のようなものがのし上がって勝ってきた点に鑑みて intel 落ち目だしありえなくもない。が、ARM には禿の詐欺師まがいの祟りがありそう。

それはともかく Raspberry Pi 3 は 64bit Linux が動くようになったらしく、ARM 謹製 Fortran 稼働の条件を満たしたような気もするので、特攻隊の方が試しに行ってほしい所。

またも 4chan QAnon

gendai.ismedia.jp

www.youtube.com

2018-08-13

Cray Chapel から X-window に点を打つ

Chapel から bash on windows で X-window 利用

Julia 言語が Ver.1.0 になったとかで、記念にマイナー言語の Chapel をいじります。Chapel は coarray Fortran と同じく PGAS 言語です。同じ PGAS 仲間のよしみでいじります。

Chapel には、最近は Rust 言語のように transaction 的なメモリー管理も導入されているようですが、よく分かりません。

Chapel: Productive Parallel Programming

C言語との混成もできるようになったようなので、X-Window で図形を描くことに挑戦してみました。

Fortran の時と同じようにやってみました。ただ乱数ルーチンを調べるのが面倒なので、丸を書いてお茶を濁しました。
fortran66.hatenablog.com

コンパイル＆ゴー

まず、gcc で C 言語のルーチンをコンパイルしてオブジェクトを生成しておきます。

それを chapel に渡します、Xwindow のライブラリなどを認識させるためリンカ―・オプション -lX11 をつけます。警告メッセージが沢山出ますが何とかなります。

実行は、マルチロケール対応版にしてあるので、ロケール数を指定してやります。（-nl 1 は１個）

O@HP8:~$ gfortran -c plot3.c
O@HP8:~$ chpl  plot3.o  plot3.chpl  -lX11
In file included from /tmp/chpl-O-4402.deleteme/_main.c:43:0:
/tmp/chpl-O-4402.deleteme/plot.c: In function ‘chpl_user_main’:
/tmp/chpl-O-4402.deleteme/plot.c:32:1: warning: implicit declaration of function ‘X_open’; did you mean ‘open’? [-Wimplicit-function-declaration]
 X_open(((int32_t)(INT64(800))), ((int32_t)(INT64(600))));
 ^~~~~~
 open
/tmp/chpl-O-4402.deleteme/plot.c:42:1: warning: implicit declaration of function ‘X_point’; did you mean ‘isprint’? [-Wimplicit-function-declaration]
 X_point(call_tmp_chpl4, ((int32_t)((INT32(300) + i_chpl))));
 ^~~~~~~
 isprint
/tmp/chpl-O-4402.deleteme/plot.c:59:1: warning: implicit declaration of function ‘X_close’; did you mean ‘pclose’? [-Wimplicit-function-declaration]
 X_close();
 ^~~~~~~
 pclose
O@HP8:~$ ./plot3 -nl 1
X-window plot

1
O@HP8:~$

コンパイルは遅いです。

出力図

f:id:fortran66:20180813025648p:plain

ソース・プログラム

C　plot3.c

窓が立ち上がるまで１秒ほど時間をおいています。

#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <unistd.h>
// unistd.h -> sleep()
static Display* d;
static Window   w;
static GC       gc;
unsigned long white, black;

void X_open(int nx, int ny){
    // Open a display.
    d = XOpenDisplay(0);
    if ( !d ) return;
    //
    white = WhitePixel(d, DefaultScreen(d));
    black = BlackPixel(d, DefaultScreen(d));
    // Create the window
    w = XCreateSimpleWindow(d, DefaultRootWindow(d), 0, 0, nx, ny, 0, black, white);
    XMapWindow(d, w);
    gc = XCreateGC(d, w, 0, 0);
    XFlush(d);
    sleep(1);
}

void X_point(int ix, int iy){
     XDrawPoint(d, w, gc, ix, iy);
     XFlush(d);
}

void X_flush(){
     XFlush(d);
}

void X_close(void){
    XFreeGC(d, gc);
    XDestroyWindow(d, w);
    XFlush(d);
    XCloseDisplay(d);
}

chapel　plot3.chpl

デフォルトの整数型が 64bit 整数のため、X-windowの 32bit 引数と合わず、型変換がうるさくなりましたが、無視しても大丈夫な模様です。コンソールからの入力は例外処理を強要させれるのですが、適当にごまかしました。

extern proc X_open(nx: int(32), ny: int(32));
extern proc X_point(ix:int(32), iy: int(32));
extern proc X_close();

module plot {

    proc main() {
        var i, ix, iy : int(32);
        var x, y: real;
        X_open(800, 600);

        writeln("X-window plot");
        for i in 0:int(32)..100:int(32)
        {
            y = i:real;
            x = sqrt(100.0**2-y**2);
            ix = 400:int(32) + x:int(32);
            iy = 300:int(32) + i;
            X_point(ix, iy);
            iy = 300:int(32) - i;
            X_point(ix, iy);
            ix = 400:int(32) - x:int(32);
            iy = 300:int(32) + i;
            X_point(ix, iy);
            iy = 300:int(32) - i;
            X_point(ix, iy);
        }

        try! {i = read(int(32));} // pause
        X_close();
    }
}

2018-08-12

【メモ帳】制限付き分割 p(n,5)

Andrews & Eriksson 「整数の分割」演習 88

だいぶ昔に読んでたアンドリューズとエリクソンの「整数の分割」の演習問題 88. が面倒くさかったので飛ばしていたのですが、夏休みなので暇つぶしに Maple/Mathematica の手を借りてやってみました。すぐ忘れるのでメモしておきます。

整数の分割

作者: ジョージ・アンドリュース,キムモ・エリクソン,佐藤文広
出版社/メーカー: 数学書房
発売日: 2006/05
メディア: 単行本
この商品を含むブログを見る

きんもーっ☆・エリクソン

Integer Partitions

作者: George E. Andrews,Kimmo Eriksson
出版社/メーカー: Cambridge University Press
発売日: 2004/10/11
メディア: ペーパーバック
この商品を含むブログを見る

演習 88.

$p(n, 5)=\{{1\over2880}(n+8)(n^3+22n^2+44n+248+180\lfloor {n\over2}\rfloor \}$ であることを証明せよ。

（ここで p(n,5) は整数ｎを１から５までの数に分割するときの組み合わせの個数、中括弧は四捨五入、 $\lfloor x\rfloor$ は床関数になってます。）

例 p(n, 4)、 4 までの制限つき分割

-n=1　p(1, 4)=1 
O
-n=2　p(2 ,4)=2
O, OO
O
-n=3　p(3, 4)=3
O, OO, OOO
O, O
O
-n=4　p(4, 4)=5
O, OO, OO, OOO, OOOO
O, O,  OO, O
O, O
O
-n=5　p(5, 4)=6
O, OO, OO, OOO, OOO, OOOO
O, O,  OO, O,   OO,  O
O, O,  O,  O
O, O
O

導出

$\begin{align} p(n, 5)&=\prod_{k=1}^5{1\over(1-x^k)}\\ &={1\over120}{1\over(1-x)^5}+{1\over24}{1\over(1-x)^4}+{31\over288}{1\over(1-x)^3}+{3\over16}{1\over(1-x)^2}+{1\over64}{1\over(1+x)^2}\\ &+{20831\over86400}{1\over1-x}+{13\over128}{1\over1+x}+{1\over27}{x+2\over1+x+x^2}+{1\over25}{4+3x+2x^2+x^3\over1+x+x^2+x^3+x^4}\\ &={1\over120}{1\over(1-x)^5}+{1\over24}{1\over(1-x)^4}+{31\over288}{1\over(1-x)^3}+{3\over16}{1\over(1-x)^2}+{1\over64}{1\over(1+x)^2}\\ &+{1\over16}{1\over1-x}+{13\over128}({1\over1-x}+{1\over1+x})+{1\over27}({1\over1-x}+{x+2\over1+x+x^2})+ {1\over5}({1\over1-x}+{4+3x+2x^2+x^3\over1+x+x^2+x^3+x^4})\\ &={1\over120}{1\over(1-x)^5}+{1\over24}{1\over(1-x)^4}+{31\over288}{1\over(1-x)^3}+{3\over16}{1\over(1-x)^2}+{1\over64}{1\over(1+x)^2}\\ &+{1\over16}{1\over1-x}+{13\over64}{1\over1-x^2}+{1\over9}{1\over1-x^3}+{1\over5}{1\over1-x^5}\\ \end{align}$

部分分数に展開の上で、いくつかの項をまとめた。（部分分数展開 Maple convert(f, parfrac), Mathematica Apart[f]）　

級数に展開すると
$\begin{align} p(n,5)&=\sum_n\left({1\over120}\binom{n+4}{4}+{1\over24}\binom{n+3}{3}+{31\over288}\binom{n+2}{2}+{3\over16}\binom{n+1}{1}+{(-1)^n\over64}\binom{n+1}{1}+{1\over16} \right)x^n \\ &+\sum_n\left({13\over64}x^{2n}+{1\over9}x^{3n}+{1\over5}x^{5n}\right)\\ \end{align}$
ここで、二段目の項は区間 [0, 14/45] に収まるので、とりあえず無視しておく。

１段目の項をさらに変形すると、
$\begin{align} &={1\over2880} \left( n^4+30n^3+310n^2+1275n+1124 \right)+{45\over2880}(-1)^n\left((n+1)+7\right)\\ &-{45\over2880}(-1)^n7\\ &={1\over2880} \left( n^4+30n^3+310n^2+1230n+764+45( (-1)^n (n+8)+(n+8) )\right)\\ &-{45\over2880}(-1)^n7\\ \end{align}$
ここで、因数分解の係数を合わせるために（答えから逆算して）余分な項を付け足し、二段目でそれをキャンセルさせている。この項は先ほど無視することにした項と合わせて値は区間 [-7/64, 99/320] になるが、四捨五入に影響しないので無視できる。また前の方の項から値を借りてきて後ろの項を整えている。

ここで、
$(-1)^n+1=2\left(2\lfloor{n\over2}\rfloor-n+1\right)$
であるので、これを用いて一段目の式を変形すると、
$\begin{align} &={1\over2880}\left(n^4+30n^3+310n^2+1230n+764+90(n+8)(2\lfloor{n\over2}\rfloor-n+1)\right)\\ &={1\over2880}\left(n^4+30n^3+220n^2+600n+1484+(n+8)180\lfloor{n\over2}\rfloor\right)\\ &={1\over2880}\left((n+8)(n^3+22n^2+44n+248+180\lfloor{n\over2}\rfloor)-500\right)\\ \end{align}$
ここで、 $-{500\over2880}=-{25\over144}$ の項は因数分解の都合合わせで出る余りで、今までの半端な項と合わせてやって区間 [-163/576,391/2880] となって絶対値は 1/2 に満たないので四捨五入に影響しない。よって無視できる。

以上のことから
$p(n,5)=\left\{ {1\over2880}(n+8)\left( n^3+22n^2+44n+248+180\lfloor{n\over2}\rfloor\right) \right\}$
であることが示せた。

無視できるカスの項は適当な計算なので間違ってたらゴメンｗ

その他

演習87. $p(n, 4)=\left\{ {1\over144}(n+5)(n^2+n+22+18\lfloor{n\over2}\rfloor) \right\}$ は、上記と同じ方法で求まる。

演習86. $p(n, 4)=\left\{ {1\over36}\lfloor{n+4\over2}\rfloor^2(3\lfloor{n+9\over2}\rfloor-\lfloor{n+10\over2}\rfloor)\right\}$ は等式変形としては求められるが、何か求め方があってこの形になっているのではないかと思われるがよく分からんｗ

いじっているうちに、副産物として $p(n, 4)=\left\{ {1\over36}\lfloor{n+7\over2}\rfloor^2(3\lfloor{n+2\over2}\rfloor-\lfloor{n+1\over2}\rfloor) \right\}$ が出てきた。こっちの方があぶれるカス項が小さい。

Fortran にて

４までの分割は、３までの分割の和で表わせるので、それをプログラムする。

４段のヤング図の４段目を０から１個づつ増やして行くと、残りの n-4i 個を３段のヤング図にするだけの可能性がある。その総和を求めれば p(n, 4)。なお p(n,3)=round((n+3)^2/12)。これは上記の論法を３段と２段でやれば求まる。２段は１段とやればよい。

この方法で演習86. が出来ないかと思ったが、うまくいかなかったｗ

Modern Fortran Explained: Incorporating Fortran 2018 (Numerical Mathematics and Scientific Computation)

作者: Michael Metcalf,John Reid,Malcolm Cohen
出版社/メーカー: Oxford Univ Pr
発売日: 2018/11/06
メディア: ハードカバー
この商品を含むブログを見る

ソース・プログラム

    module m_mod
        implicit none
    contains
        integer function p4(n) 
            integer, intent(in) :: n
            integer :: i
            p4 = 0
            do i = 0, n / 4
                p4 = p4 + p3(n - 4 * i) 
            end do
        end function p4
        
        integer function p3(n)
            integer, intent(in) :: n
            p3 = nint((n + 3)**2 / 12.0)        
        end function p3
    end module m_mod
        
    program partition4
      use m_mod
      implicit none
      integer :: i
      do i = 1, 10
          print *, i, p4(i)
      end do 
    end program partition4

実行結果

           1           1
           2           2
           3           3
           4           5
           5           6
           6           9
           7          11
           8          15
           9          18
          10          23
続行するには何かキーを押してください . . .

CoArray Fortranで先着幾名かまで実行

critical 構文と atomic サブルーチン

プログラム

実行結果

Archer の講義・講演ページ

CAF (CoArray Fortran) で Halo Exchange

ソース・プログラム１

ソース・プログラム２

実行結果

英 NAG Fortran 2008 まとめ

John Reid 氏 「The New Features of Fortran 2018」最新版

Damian Rouson 氏の「Writing Fortran 2018 Today: Object-Oriented Parallel Programming」

「CAFe: Coarray Fortran Extensions for Heterogeneous Computing」

CoArray の基本思想 by R. Numrich (2001)

John Mellor-Crummey 氏「From HPF to Coarray Fortran 2.0」(2013)

Fortran の特徴

Arjen Boin

ARMいわく「次世代プロセッサーでインテルを追い抜く」。2020年にはノートPCクラスの処理速度を獲得へ

またも 4chan QAnon

Chapel から bash on windows で X-window 利用

コンパイル＆ゴー

出力図

ソース・プログラム

C plot3.c

chapel plot3.chpl

Andrews & Eriksson 「整数の分割」演習 88

演習 88.

例 p(n, 4)、 4 までの制限つき分割

導出

その他

Fortran にて

ソース・プログラム

実行結果

John Reid 氏「The New Features of Fortran 2018」最新版

C　plot3.c

chapel　plot3.chpl