Iozone on multiple nodes using ssh


Chuanwen Wu
Hi,
I am trying to use iozone to test my cluster over ssh instead of rsh,
but I still can't get iozone to run on multiple nodes.

I have two nodes, node73 and node74, on my LAN. Each node can execute
commands on the other as user "dnfs" without being prompted for a
password:
dnfs@node73 ~ $ ssh node74 date
Thu Jan  8 22:44:55 CST 2009

dnfs@node74 ~ $ ssh node73 date
Thu Jan  8 22:47:19 CST 2009

Iozone uses rsh to execute commands on the clients, so I set RSH to
make it use ssh instead:
dnfs@node73 ~ $ cat ~/.bashrc
# /etc/skel/.bashrc
[...]
# Put your fun stuff here.
export RSH=ssh

Here is my clientlist, which contains the nodes' information:
dnfs@node73 ~ $ cat clientlist
node74 /home/dnfs /tmp/iozone
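
Each clientlist line carries three whitespace-separated fields: the client host name, the working directory for the test files on that client, and the path to the iozone executable there. A minimal sketch of a sanity check for that layout follows. It assumes that lines starting with `#` are comments, and `check_clientlist` is a hypothetical helper name, not part of iozone:

```shell
# check_clientlist FILE
# Hypothetical helper: report every non-comment, non-blank line that does not
# have exactly three whitespace-separated fields (host, workdir, iozone path).
# Exit status is 0 when all entries look well-formed, 1 otherwise.
check_clientlist() {
    awk '
        /^[ \t]*#/ || NF == 0 { next }   # skip comments and blank lines
        NF != 3 {
            printf "line %d: expected 3 fields, got %d\n", NR, NF
            bad = 1
        }
        END { exit bad + 0 }
    ' "$1"
}
```

Running `check_clientlist clientlist` before starting a benchmark catches a missing field early. It does not check that the host is reachable or that the paths exist.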

Then I ran iozone on node73:
/***********************************************************/
dnfs@node73 ~ $ iozone -s 1m -Rb log.xls  -t 1 -+m clientlist
        Iozone: Performance Test of File I/O
                Version $Revision: 3.242 $
                Compiled for 64 bit mode.
                Build: linux-AMD64

        Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
                     Al Slater, Scott Rhine, Mike Wisner, Ken Goss
                     Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
                     Randy Dunlap, Mark Montague, Dan Million,
                     Jean-Marc Zucconi, Jeff Blomberg,
                     Erik Habbinga, Kris Strecker, Walter Wong.

        Run began: Thu Jan  8 23:11:13 2009

        File size set to 1024 KB
        Excel chart generation enabled
        Network distribution mode enabled.
        Command line used: iozone -s 1m -Rb log.xls -t 1 -+m clientlist
        Output is in Kbytes/sec
        Time Resolution = 0.000001 seconds.
        Processor cache size set to 1024 Kbytes.
        Processor cache line size set to 32 bytes.
        File stride size set to 17 * record size.
        Throughput test with 1 process
        Each process writes a 1024 Kbyte file in 4 Kbyte records
/******************************************************/
Then iozone stopped here.

I also used strace to see what happened:
/******************************************************/
dnfs@node73 ~ $ strace iozone -s 1m -Rb log.xls  -t 1 -+m clientlist
execve("/usr/bin/iozone", ["iozone", "-s", "1m", "-Rb", "log.xls",
"-t", "1", "-+m", "clientlist"], [/* 45 vars */]) = 0
brk(0)                                  = 0x7cb000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7f67f3639000
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7f67f3638000
access("/etc/ld.so.preload", R_OK)      = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=48606, ...}) = 0
mmap(NULL, 48606, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7f67f362c000
close(3)                                = 0
open("/lib/librt.so.1", O_RDONLY)       = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\300\"\0\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=35688, ...}) = 0
mmap(NULL, 2132968, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f67f3217000
mprotect(0x7f67f321f000, 2093056, PROT_NONE) = 0
mmap(0x7f67f341e000, 8192, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7000) = 0x7f67f341e000
close(3)                                = 0
open("/lib/libpthread.so.0", O_RDONLY)  = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\240W\0\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=131577, ...}) = 0
mmap(NULL, 2204528, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f67f2ffc000
mprotect(0x7f67f3011000, 2097152, PROT_NONE) = 0
mmap(0x7f67f3211000, 8192, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x15000) = 0x7f67f3211000
mmap(0x7f67f3213000, 13168, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f67f3213000
close(3)                                = 0
open("/lib/libc.so.6", O_RDONLY)        = 3
read(3, "\177ELF\2\1\1\0\0\0\0\0\0\0\0\0\3\0>\0\1\0\0\0\220\334\1\0\0\0\0\0"...,
832) = 832
fstat(3, {st_mode=S_IFREG|0755, st_size=1293456, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7f67f362b000
mmap(NULL, 3399928, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3,
0) = 0x7f67f2cbd000
mprotect(0x7f67f2df3000, 2093056, PROT_NONE) = 0
mmap(0x7f67f2ff2000, 20480, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x135000) = 0x7f67f2ff2000
mmap(0x7f67f2ff7000, 16632, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x7f67f2ff7000
close(3)                                = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7f67f362a000
arch_prctl(ARCH_SET_FS, 0x7f67f362a6f0) = 0
mprotect(0x7f67f2ff2000, 16384, PROT_READ) = 0
mprotect(0x7f67f3211000, 4096, PROT_READ) = 0
mprotect(0x7f67f341e000, 4096, PROT_READ) = 0
mprotect(0x62a000, 4096, PROT_READ)     = 0
mprotect(0x7f67f363a000, 4096, PROT_READ) = 0
munmap(0x7f67f362c000, 48606)           = 0
set_tid_address(0x7f67f362a780)         = 31886
set_robust_list(0x7f67f362a790, 0x18)   = 0
rt_sigaction(SIGRTMIN, {0x7f67f3001310, [], SA_RESTORER|SA_SIGINFO,
0x7f67f3009ec0}, NULL, 8) = 0
rt_sigaction(SIGRT_1, {0x7f67f3001390, [],
SA_RESTORER|SA_RESTART|SA_SIGINFO, 0x7f67f3009ec0}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN RT_1], NULL, 8) = 0
getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0
uname({sys="Linux", node="Gentoo-F312-73", ...}) = 0
brk(0)                                  = 0x7cb000
brk(0x7ec000)                           = 0x7ec000
open("/etc/localtime", O_RDONLY)        = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=405, ...}) = 0
fstat(3, {st_mode=S_IFREG|0644, st_size=405, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7f67f3637000
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0\3\0\0\0\0"...,
4096) = 405
lseek(3, -240, SEEK_CUR)                = 165
read(3, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0\3\0\0\0\0"...,
4096) = 240
close(3)                                = 0
munmap(0x7f67f3637000, 4096)            = 0
rt_sigaction(SIGINT, {0x408b2c, [INT], SA_RESTORER|SA_RESTART,
0x7f67f2ced430}, {SIG_DFL}, 8) = 0
rt_sigaction(SIGTERM, {0x408b2c, [TERM], SA_RESTORER|SA_RESTART,
0x7f67f2ced430}, {SIG_DFL}, 8) = 0
mmap(NULL, 18878464, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS,
-1, 0) = 0x7f67f1abc000
open("clientlist", O_RDONLY)            = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=30, ...}) = 0
mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1,
0) = 0x7f67f3637000
read(3, "node74 /home/dnfs /tmp/iozone\n", 4096) = 30
read(3, "", 4096)                       = 0
close(3)                                = 0
munmap(0x7f67f3637000, 4096)            = 0
write(1, "\tIozone: Performance Test of Fil"..., 38     Iozone:
Performance Test of File I/O
) = 38
write(1, "\t        Version $Revision: 3.24"..., 64
Version $Revision: 3.242 $
                Compiled for 64 bit mode.
) = 64
write(1, "\t\tBuild: linux-AMD64 \n\n", 23              Build: linux-AMD64

) = 23
write(1, "\tContributors:William Norcott, D"..., 71
Contributors:William Norcott, Don Capps, Isom Crawford, Kirby Collins
) = 71
write(1, "\t             Al Slater, Scott R"..., 60
Al Slater, Scott Rhine, Mike Wisner, Ken Goss
) = 60
write(1, "\t             Steve Landherr, Br"..., 69
Steve Landherr, Brad Smith, Mark Kelly, Dr. Alain CYR,
) = 69
write(1, "\t             Randy Dunlap, Mark"..., 57
Randy Dunlap, Mark Montague, Dan Million,
) = 57
write(1, "\t             Jean-Marc Zucconi,"..., 48
Jean-Marc Zucconi, Jeff Blomberg,
) = 48
write(1, "\t             Erik Habbinga, Kri"..., 58
Erik Habbinga, Kris Strecker, Walter Wong.

) = 58
write(1, "\tRun began: Thu Jan  8 23:12:20 "..., 38     Run began: Thu
Jan  8 23:12:20 2009

) = 38
write(1, "\tFile size set to 1024 KB\n", 26     File size set to 1024 KB
) = 26
write(1, "\tExcel chart generation enabled\n", 32       Excel chart
generation enabled
) = 32
write(1, "\tNetwork distribution mode enabl"..., 36     Network
distribution mode enabled.
) = 36
write(1, "\tCommand line used:", 19     Command line used    = 19
write(1, " iozone", 7 iozone)                  = 7
write(1, " -s", 3 -s)                      = 3
write(1, " 1m", 3 1m)                      = 3
write(1, " -Rb", 4 -Rb)                     = 4
write(1, " log.xls", 8 log.xls)                 = 8
write(1, " -t", 3 -t)                      = 3
write(1, " 1", 2 1)                       = 2
write(1, " -+m", 4 -+m)                     = 4
write(1, " clientlist", 11 clientlist)             = 11
write(1, "\n", 1
)                       = 1
write(1, "\tOutput is in Kbytes/sec", 24        Output is in Kbytes/sec) = 24
write(1, "\n", 1
)                       = 1
write(1, "\tTime Resolution = 0.000001 seco"..., 37     Time
Resolution = 0.000001 seconds.
) = 37
write(1, "\tProcessor cache size set to 102"..., 42     Processor
cache size set to 1024 Kbytes.
) = 42
write(1, "\tProcessor cache line size set t"..., 44     Processor
cache line size set to 32 bytes.
) = 44
write(1, "\tFile stride size set to 17 * re"..., 43     File stride
size set to 17 * record size.
) = 43
write(1, "\tThroughput test with 1 process\n", 32       Throughput
test with 1 process
) = 32
shmget(IPC_PRIVATE, 16384, IPC_CREAT|0666) = 1376256
shmat(1376256, 0, 0)                    = ?
shmctl(1376256, IPC_RMID, 0)            = 0
write(1, "\tEach process writes a 1024 Kbyt"..., 58     Each process
writes a 1024 Kbyte file in 4 Kbyte records
) = 58
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 3
setsockopt(3, SOL_SOCKET, SO_RCVBUF, [262144], 4) = 0
bind(3, {sa_family=AF_INET, sin_port=htons(20000),
sin_addr=inet_addr("0.0.0.0")}, 16) = 0
rt_sigaction(SIGINT, {SIG_IGN}, {0x408b2c, [INT],
SA_RESTORER|SA_RESTART, 0x7f67f2ced430}, 8) = 0
rt_sigaction(SIGQUIT, {SIG_IGN}, {SIG_DFL}, 8) = 0
rt_sigprocmask(SIG_BLOCK, [CHLD], [], 8) = 0
clone(child_stack=0, flags=CLONE_PARENT_SETTID|SIGCHLD,
parent_tidptr=0x7ffffb6398b8) = 31887
wait4(31887, [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 31887
rt_sigaction(SIGINT, {0x408b2c, [INT], SA_RESTORER|SA_RESTART,
0x7f67f2ced430}, NULL, 8) = 0
rt_sigaction(SIGQUIT, {SIG_DFL}, NULL, 8) = 0
rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
--- SIGCHLD (Child exited) @ 0 (0) ---
recvfrom(3,
/******************************************************/

I am not familiar with strace, but I guess iozone was waiting to
receive something from node74.
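
The trace does end in a blocked `recvfrom` on the UDP socket that iozone bound to port 20000 a few lines earlier, which fits that guess: whatever the client is expected to send to that port does not seem to arrive. Below is a small sketch that pulls the bound port numbers out of a saved trace so they can be matched against a packet capture. It assumes the trace was written to a file with strace's `-o` option (so lines are not wrapped), and `bound_ports` is a hypothetical helper name:

```shell
# bound_ports FILE
# Print the port number of every bind() call found in a saved strace log.
bound_ports() {
    sed -n 's/^bind(.*sin_port=htons(\([0-9][0-9]*\)).*/\1/p' "$1"
}
```

For example, after `strace -o trace.log iozone -s 1m -t 1 -+m clientlist`, calling `bound_ports trace.log` lists the ports to watch for on the client side.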

Has anyone ever used iozone across multiple nodes?
Any help would be much appreciated!

PS:
I can run iozone to test local fs without any problem.

--
wcw


Re: Iozone on multiple nodes using ssh

t35t0r
Not on Gentoo, but on RHEL. Run this script on the master host. It will
run iozone for file sizes from small to large (up to 1G).

---- cut ----

#!/bin/bash
# (bash, not plain sh: the loop below uses bash's arithmetic for-loop syntax)

# full path to iozone executable
iozoneExec="/home/somewhere/bin/iozone"
# full path to the host file
hostFile="/home/somewhere/iozoneHostsAwesomeFS.txt"
# the number of hosts in the host file
numHosts="16"
# the record size or block size to use in all transfers
recordSize="32k"
# the file output prefix (the final file name will be
# ${filePrefix}-${recordSize}:${fs}k.xls or .out depending on whether it
# is the excel file or log file, where ${fs} is the file size for the
# transfer in KB (incrementally generated by the loop)).
filePrefix="cluster-awesomeFS"

# make iozone start the remote clients with ssh instead of rsh
export rsh="ssh"
export RSH="ssh"

# file sizes from 32k to 1g increasing by powers of 2
for ((i=5;i<=20;i+=1)); do
        fs=$((2**i))
        "$iozoneExec" -RcTM -t "$numHosts" \
                -b "${filePrefix}-${recordSize}:${fs}k.xls" \
                -r "$recordSize" -s "${fs}k" -+m "$hostFile" \
                > "${filePrefix}-${recordSize}:${fs}k.out"
done

---- cut ----
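
The loop in the script walks the exponent from 5 to 20, so the file sizes run from 2^5 = 32 KB up to 2^20 = 1048576 KB (1 GB), 16 sizes in all. Here is a short sketch that only prints those sizes, written with doubling so that it also works in a plain POSIX shell. The name `iozone_sizes_kb` is hypothetical:

```shell
# Print the file sizes (in KB) that the benchmark loop iterates over:
# 32, 64, 128, ..., 1048576.
iozone_sizes_kb() {
    size=32
    while [ "$size" -le 1048576 ]; do
        echo "$size"
        size=$((size * 2))
    done
}
```

Printing the list first is a cheap way to see how many iozone runs the script will start before committing a cluster to them.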

---- cut ----

sample hosts file:

# hostname   directoryForTestFiles   pathToIozoneOnEachHost
node1 /net/NAS/awesomeFS /home/somewhere/bin/iozone
node2 /net/NAS/awesomeFS /home/somewhere/bin/iozone
node3 /net/NAS/awesomeFS /home/somewhere/bin/iozone

---- cut ----
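
The script sets numHosts by hand, and its comment says the value should be the number of hosts in the host file. Below is a sketch that derives the count from the file instead. It assumes comment lines start with `#`, as in the sample above, and `count_clients` is a hypothetical helper name:

```shell
# count_clients FILE
# Print the number of client entries in an iozone host file,
# ignoring blank lines and lines that start with '#'.
count_clients() {
    awk '!/^[ \t]*#/ && NF > 0 { n++ } END { print n + 0 }' "$1"
}
```

In the script this could replace the fixed value, for example `numHosts=$(count_clients "$hostFile")`, so that the `-t` argument cannot drift from the file contents.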

On Thu, Jan 8, 2009 at 9:12 AM, Chuanwen Wu <[hidden email]> wrote:
> Hi,
> I am trying to use iozone to test my cluster with ssh but not rsh, but
> I still can't running iozone on multiple nodes.


Re: Iozone on multiple nodes using ssh

Chuanwen Wu
In reply to this post by Chuanwen Wu
Hi, thanks t35t0r!
I have tried your script, but I still get the same problem.

> Then I run iozone on node73:
> /***********************************************************/
> dnfs@node73 ~ $ iozone -s 1m -Rb log.xls  -t 1 -+m clientlist
>        Iozone: Performance Test of File I/O
>                Version $Revision: 3.242 $
>                Compiled for 64 bit mode.
>                Build: linux-AMD64
[...]

I also have the tcpdump output captured on node74 while the command above was running:
/******************************************************/
node74 ~ # tcpdump host node73 -vv and not arp
tcpdump: listening on eth0, link-type EN10MB (Ethernet), capture size 68 bytes
23:37:17.048678 IP (tos 0x0, ttl 64, id 28484, offset 0, flags [DF],
proto TCP (6), length 60) node73.36691 > node74.ssh: S
1224275785:1224275785(0) win 5840 <mss 1460,sackOK,timestamp
415078413[|tcp]>
23:37:17.049350 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto
TCP (6), length 60) node74.ssh > node73.36691: S
2873253591:2873253591(0) ack 1224275786 win 5792 <mss
1460,sackOK,timestamp 414802723[|tcp]>
23:37:17.048719 IP (tos 0x0, ttl 64, id 28485, offset 0, flags [DF],
proto TCP (6), length 52) node73.36691 > node74.ssh: ., cksum 0xee0e
(correct), 1:1(0) ack 1 win 92 <nop,nop,timestamp 415078413 414802723>
23:37:17.054017 IP (tos 0x0, ttl 64, id 15867, offset 0, flags [DF],
proto TCP (6), length 72) node74.ssh > node73.36691: P 1:21(20) ack 1
win 91 <nop,nop,timestamp 414802724 415078413>
23:37:17.054077 IP (tos 0x0, ttl 64, id 28486, offset 0, flags [DF],
proto TCP (6), length 52) node73.36691 > node74.ssh: ., cksum 0xedf8
(correct), 1:1(0) ack 21 win 92 <nop,nop,timestamp 415078414
414802724>
23:37:17.054163 IP (tos 0x0, ttl 64, id 28487, offset 0, flags [DF],
proto TCP (6), length 72) node73.36691 > node74.ssh: P 1:21(20) ack 21
win 92 <nop,nop,timestamp 415078414 414802724>
23:37:17.054182 IP (tos 0x0, ttl 64, id 15868, offset 0, flags [DF],
proto TCP (6), length 52) node74.ssh > node73.36691: ., cksum 0xede5
(correct), 21:21(0) ack 21 win 91 <nop,nop,timestamp 414802724
415078414>
23:37:17.054387 IP (tos 0x0, ttl 64, id 28488, offset 0, flags [DF],
proto TCP (6), length 844) node73.36691 > node74.ssh: P 21:813(792)
ack 21 win 92 <nop,nop,timestamp 415078414 414802724>
23:37:17.054403 IP (tos 0x0, ttl 64, id 15869, offset 0, flags [DF],
proto TCP (6), length 52) node74.ssh > node73.36691: ., cksum 0xeab4
(correct), 21:21(0) ack 813 win 116 <nop,nop,timestamp 414802724
415078414>
23:37:17.055050 IP (tos 0x0, ttl 64, id 15870, offset 0, flags [DF],
proto TCP (6), length 836) node74.ssh > node73.36691: P 21:805(784)
ack 813 win 116 <nop,nop,timestamp 414802725 415078414>
23:37:17.055274 IP (tos 0x0, ttl 64, id 28489, offset 0, flags [DF],
proto TCP (6), length 76) node73.36691 > node74.ssh: P 813:837(24) ack
805 win 116 <nop,nop,timestamp 415078414 414802725>
23:37:17.057803 IP (tos 0x0, ttl 64, id 15871, offset 0, flags [DF],
proto TCP (6), length 204) node74.ssh > node73.36691: P 805:957(152)
ack 837 win 116 <nop,nop,timestamp 414802725 415078414>
23:37:17.059087 IP (tos 0x0, ttl 64, id 28490, offset 0, flags [DF],
proto TCP (6), length 196) node73.36691 > node74.ssh: P 837:981(144)
ack 957 win 141 <nop,nop,timestamp 415078415 414802725>
23:37:17.068091 IP (tos 0x0, ttl 64, id 15872, offset 0, flags [DF],
proto TCP (6), length 772) node74.ssh > node73.36691: P 957:1677(720)
ack 981 win 140 <nop,nop,timestamp 414802728 415078415>
23:37:17.069761 IP (tos 0x0, ttl 64, id 28491, offset 0, flags [DF],
proto TCP (6), length 68) node73.36691 > node74.ssh: P 981:997(16) ack
1677 win 165 <nop,nop,timestamp 415078418 414802728>
23:37:17.107120 IP (tos 0x0, ttl 64, id 15873, offset 0, flags [DF],
proto TCP (6), length 52) node74.ssh > node73.36691: ., cksum 0xe35a
(correct), 1677:1677(0) ack 997 win 140 <nop,nop,timestamp 414802738
415078418>
23:37:17.107161 IP (tos 0x0, ttl 64, id 28492, offset 0, flags [DF],
proto TCP (6), length 100) node73.36691 > node74.ssh: P 997:1045(48)
ack 1677 win 165 <nop,nop,timestamp 415078427 414802738>
23:37:17.107169 IP (tos 0x0, ttl 64, id 15874, offset 0, flags [DF],
proto TCP (6), length 52) node74.ssh > node73.36691: ., cksum 0xe321
(correct), 1677:1677(0) ack 1045 win 140 <nop,nop,timestamp 414802738
415078427>
23:37:17.107206 IP (tos 0x0, ttl 64, id 15875, offset 0, flags [DF],
proto TCP (6), length 100) node74.ssh > node73.36691: P 1677:1725(48)
ack 1045 win 140 <nop,nop,timestamp 414802738 415078427>
23:37:17.107383 IP (tos 0x0, ttl 64, id 28493, offset 0, flags [DF],
proto TCP (6), length 116) node73.36691 > node74.ssh: P 1045:1109(64)
ack 1725 win 165 <nop,nop,timestamp 415078427 414802738>
23:37:17.109273 IP (tos 0x0, ttl 64, id 15876, offset 0, flags [DF],
proto TCP (6), length 132) node74.ssh > node73.36691: P 1725:1805(80)
ack 1109 win 140 <nop,nop,timestamp 414802738 415078427>
23:37:17.109407 IP (tos 0x0, ttl 64, id 28494, offset 0, flags [DF],
proto TCP (6), length 420) node73.36691 > node74.ssh: P 1109:1477(368)
ack 1805 win 165 <nop,nop,timestamp 415078428 414802738>
23:37:17.109920 IP (tos 0x0, ttl 64, id 15877, offset 0, flags [DF],
proto TCP (6), length 372) node74.ssh > node73.36691: P 1805:2125(320)
ack 1477 win 165 <nop,nop,timestamp 414802738 415078428>
23:37:17.119287 IP (tos 0x0, ttl 64, id 28495, offset 0, flags [DF],
proto TCP (6), length 692) node73.36691 > node74.ssh: P 1477:2117(640)
ack 2125 win 190 <nop,nop,timestamp 415078430 414802738>
23:37:17.119941 IP (tos 0x0, ttl 64, id 15878, offset 0, flags [DF],
proto TCP (6), length 84) node74.ssh > node73.36691: P 2125:2157(32)
ack 2117 win 190 <nop,nop,timestamp 414802741 415078430>
23:37:17.120157 IP (tos 0x0, ttl 64, id 28496, offset 0, flags [DF],
proto TCP (6), length 116) node73.36691 > node74.ssh: P 2117:2181(64)
ack 2157 win 190 <nop,nop,timestamp 415078431 414802741>
23:37:17.121928 IP (tos 0x0, ttl 64, id 15879, offset 0, flags [DF],
proto TCP (6), length 100) node74.ssh > node73.36691: P 2157:2205(48)
ack 2181 win 190 <nop,nop,timestamp 414802741 415078431>
23:37:17.122090 IP (tos 0x8, ttl 64, id 28497, offset 0, flags [DF],
proto TCP (6), length 148) node73.36691 > node74.ssh: P 2181:2277(96)
ack 2205 win 190 <nop,nop,timestamp 415078431 414802741>
23:37:17.124942 IP (tos 0x8, ttl 64, id 15880, offset 0, flags [DF],
proto TCP (6), length 100) node74.ssh > node73.36691: P 2205:2253(48)
ack 2277 win 190 <nop,nop,timestamp 414802742 415078431>
23:37:17.125095 IP (tos 0x8, ttl 64, id 28498, offset 0, flags [DF],
proto TCP (6), length 84) node73.36691 > node74.ssh: P 2277:2309(32)
ack 2253 win 190 <nop,nop,timestamp 415078432 414802742>
23:37:17.127056 IP (tos 0x8, ttl 64, id 15881, offset 0, flags [DF],
proto TCP (6), length 180) node74.ssh > node73.36691: P 2253:2381(128)
ack 2309 win 190 <nop,nop,timestamp 414802742 415078432>
23:37:17.127219 IP (tos 0x8, ttl 64, id 28499, offset 0, flags [DF],
proto TCP (6), length 84) node73.36691 > node74.ssh: P 2309:2341(32)
ack 2381 win 214 <nop,nop,timestamp 415078432 414802742>
23:37:17.127340 IP (tos 0x8, ttl 64, id 28500, offset 0, flags [DF],
proto TCP (6), length 52) node73.36691 > node74.ssh: F, cksum 0xdafd
(correct), 2341:2341(0) ack 2381 win 214 <nop,nop,timestamp 415078432
414802742>
23:37:17.127902 IP (tos 0x8, ttl 64, id 15882, offset 0, flags [DF],
proto TCP (6), length 52) node74.ssh > node73.36691: F, cksum 0xdb13
(correct), 2381:2381(0) ack 2342 win 190 <nop,nop,timestamp 414802743
415078432>
23:37:17.127954 IP (tos 0x8, ttl 64, id 28501, offset 0, flags [DF],
proto TCP (6), length 52) node73.36691 > node74.ssh: ., cksum 0xdafb
(correct), 2342:2342(0) ack 2382 win 214 <nop,nop,timestamp 415078432
414802743>
/************************************************/
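
The capture shown contains a single ssh connection from node73 to node74, from the first SYN at 23:37:17.048678 to the final ACK at 23:37:17.127954, and no other traffic between the two hosts. A worked sketch of that arithmetic puts the whole session at roughly 79 ms. The helper name `elapsed` is hypothetical, and it assumes both timestamps are in HH:MM:SS.ffffff form and fall on the same day:

```shell
# elapsed START END
# Print the number of seconds between two tcpdump-style timestamps.
elapsed() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, s, ":")
        split(b, e, ":")
        printf "%.6f\n", (e[1] - s[1]) * 3600 + (e[2] - s[2]) * 60 + (e[3] - s[3])
    }'
}

elapsed 23:37:17.048678 23:37:17.127954
```

A session that short suggests the remote command was launched and ssh returned almost immediately, while the master kept waiting for a reply on its own socket.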

I have tried iozone3_242 and iozone3_303 on Gentoo and RHEL4, and the
results are all similar. I guess the way I configured and ran iozone
may be incorrect.
Does anybody have the details of how to run iozone across multiple nodes?
--
wcw


Re: Iozone on multiple nodes using ssh

Bugzilla from kyron@neuralbs.com
In reply to this post by t35t0r

t35t0r wrote:
> Not on gentoo but on RHEL.

I see nothing that is RHEL specific below...

> Run this script on master host. It will run
> iozone for small to large file sizes (up to 1G).
> [...]


Re: Iozone on multiple nodes using ssh

t35t0r
On Fri, Jan 9, 2009 at 8:30 PM, Eric Thibodeau <[hidden email]> wrote:
>
> t35t0r wrote:
>
> Not on gentoo but on RHEL.
>
> I see nothing that is RHEL specific below...

Exactly. He's doing something wrong; he probably has the binaries in
the wrong place. The clientlist he's using also has only 1 host.


Re: Iozone on multiple nodes using ssh

Chuanwen Wu
>
> exactly, he's doing something wrong, probably has the binaries in the
> wrong place, the clientlist he's using has 1 host too
Binaries? Do you mean the path to iozone?
I am quite sure the path to iozone is correct. If the path were
incorrect, I would see an error message like this:

dnfs@node73 ~ $ iozone -R -s 64k -t 1 -+m clientlist
   [...]
        Throughput test with 1 process
        Each process writes a 64 Kbyte file in 4 Kbyte records
bash: /tmp/iozone: No such file or directory
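
That error is printed by the shell on the remote side, so the same check can be made without starting a benchmark. The sketch below prints one `ssh ... test -x` command per clientlist entry instead of running it (a dry run). The helper name `print_path_checks` is hypothetical, lines starting with `#` are assumed to be comments, and `-o BatchMode=yes` makes ssh fail rather than prompt for a password:

```shell
# print_path_checks FILE
# For each clientlist entry (host, workdir, iozone path), print an ssh
# command that tests whether the iozone path is executable on that host.
print_path_checks() {
    awk '!/^[ \t]*#/ && NF >= 3 {
        printf "ssh -o BatchMode=yes %s test -x %s\n", $1, $3
    }' "$1"
}
```

Reviewing the printed commands and then piping them to `sh` runs the checks. A non-zero exit status from any of them points at a wrong path or a login problem on that host.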


I also tried using two hosts in the clientlist, but the result is the same.

--
wcw


Re: Iozone on multiple nodes using ssh

t35t0r
> dnfs@node73 ~ $ iozone -R -s 64k -t 1 -+m clientlist
>   [...]
>        Throughput test with 1 process
>        Each process writes a 64 Kbyte file in 4 Kbyte records
> bash: /tmp/iozone: No such file or directory

# hostname   directoryForTestFiles   pathToIozoneOnEachHost
node1 /net/NAS/awesomeFS /home/somewhere/bin/iozone
node2 /net/NAS/awesomeFS /home/somewhere/bin/iozone
node3 /net/NAS/awesomeFS /home/somewhere/bin/iozone


Re: Iozone on multiple nodes using ssh

Chuanwen Wu
Hi, thanks!
On Sat, Jan 10, 2009 at 10:51 PM, t35t0r <[hidden email]> wrote:
>> dnfs@node73 ~ $ iozone -R -s 64k -t 1 -+m clientlist
>>   [...]
>>        Throughput test with 1 process
>>        Each process writes a 64 Kbyte file in 4 Kbyte records
I know what you mean. I only gave that as an example of what happens
when the path to iozone is incorrect: the error below is printed.
>> bash: /tmp/iozone: No such file or directory
I was just trying to show that the path to iozone on my nodes is correct.
>
> # hostname   directoryForTestFiles   pathToIozoneOnEachHost
> node1 /net/NAS/awesomeFS /home/somewhere/bin/iozone
> node2 /net/NAS/awesomeFS /home/somewhere/bin/iozone
> node3 /net/NAS/awesomeFS /home/somewhere/bin/iozone


By now I have almost given up on iozone. I have tried Iometer, which,
as far as I know, is better suited to testing parallel file systems.
It is very easy to configure and run.

--
wcw
