from k1lib.imports import *
2023-05-23 16:25:32,832 INFO worker.py:1364 -- Connecting to existing Ray cluster at address: 192.168.1.35:6379...
2023-05-23 16:25:32,836 INFO worker.py:1544 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265
Picking up from 27-multi-node, this notebook digs into the details behind applyCl and sees how fast it can really be.
# map each node id to its LAN IP, by running ifconfig on every node and grabbing the 192.x address
nodeId2Ip = applyCl.cmd("ifconfig | grep inet\ ") | apply(apply(op().strip().split(" ")[1]) | filt(op().startswith("192")) | item(), 1) | toDict()
# map each node id to its number of threads
nodeId2Cpu = None | applyCl.aS(lambda: applyCl.cpu()) | toDict()
# map the last octet of each node's IP (e.g. "57") to a display name like "57 (32 cores)", used as graph labels later
nameTrans = None | applyCl.aS(lambda: applyCl.cpu()) | lookup(nodeId2Ip, 0) | apply(op().split(".")[-1], 0) | ~apply(lambda x,y: [x, f"{x} ({y} cores)"]) | toDict()
applyCl.cmd(f"lscpu && echo {'-'*30} && inxi -b && echo {'-'*30} && lspci && echo {'-'*30} && lsusb") | apply(op().replace("\x0312", "").replace("\x03", "").all(), 1) | lookup(nodeId2Ip, 0) | apply(fmt.h, 0, level=3) | apply(join("\n") | aS(fmt.pre), 1) | join("\n").all() | aS(viz.Carousel) | aS(lambda x: x._repr_html_() | viz.Scroll(500))
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 46 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 32 On-line CPU(s) list: 0-31 Vendor ID: GenuineIntel Model name: 13th Gen Intel(R) Core(TM) i9-13900K CPU family: 6 Model: 183 Thread(s) per core: 2 Core(s) per socket: 24 Socket(s): 1 Stepping: 1 CPU max MHz: 5800.0000 CPU min MHz: 800.0000 BogoMIPS: 5990.40 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb intel_pt sha_ni xsaveopt xsavec xgetbv1 xsaves split_lock_detect avx_vnni dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp hwp_pkg_req umip pku ospke waitpkg gfni vaes vpclmulqdq tme rdpid movdiri movdir64b fsrm md_clear serialize pconfig arch_lbr flush_l1d arch_capabilities Virtualization: VT-x L1d cache: 896 KiB (24 instances) L1i cache: 1.3 MiB (24 instances) L2 cache: 32 MiB (12 instances) L3 cache: 36 MiB (1 instance) NUMA node(s): 1 NUMA node0 CPU(s): 0-31 Vulnerability Itlb multihit: Not affected Vulnerability L1tf: Not affected Vulnerability Mds: Not affected Vulnerability Meltdown: Not affected Vulnerability Mmio stale data: Not affected Vulnerability Retbleed: Not affected Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence Vulnerability Srbds: Not affected Vulnerability Tsx async abort: Not affected ------------------------------ System: Kernel 5.15.0-72-generic x86_64 bits 64 Console N/A Distro Linux Mint 21.1 Vera Machine: Type Desktop Mobo Micro-Star model PRO Z790-A WIFI (MS-7E07) v 2.0 serialUEFI American Megatrends LLC. v A.20 date 11/21/2022 CPU: Info 24-core (8-mt/16-st) 13th Gen Intel Core i9-13900K [MST AMCP] speed (MHz) avg 1090 min/max 800/5500:5800:4300 Graphics: Device-1 Intel driver N/A Device-2 NVIDIA GA102 [GeForce RTX 3080] driver nvidia v 530.41.03 Display server X.org v 1.21.1.4 with Xwayland v 22.1.1 driver X loaded nvidia unloaded fbdev,modesetting,nouveau,vesa gpu nvidia tty 80x40 Message GL data unavailable in console. 
Try -G --display Network: Device-1 Intel driver iwlwifi Device-2 Intel Ethernet I225-V driver igc Drives: Local Storage total 3.64 TiB used 80.4 GiB (2.2%) Info: Processes 462 Uptime 1d 14h 14m Memory 125.57 GiB used 20.06 GiB (16.0%) Init systemd runlevel 5 Client Unknown python3.9 client inxi 3.3.13 ------------------------------ 00:00.0 Host bridge: Intel Corporation Device a700 (rev 01) 00:02.0 Display controller: Intel Corporation Device a780 (rev 04) 00:08.0 System peripheral: Intel Corporation Device a74f (rev 01) 00:0a.0 Signal processing controller: Intel Corporation Device a77d (rev 01) 00:14.0 USB controller: Intel Corporation Device 7a60 (rev 11) 00:14.2 RAM memory: Intel Corporation Device 7a27 (rev 11) 00:14.3 Network controller: Intel Corporation Device 7a70 (rev 11) 00:16.0 Communication controller: Intel Corporation Device 7a68 (rev 11) 00:17.0 SATA controller: Intel Corporation Device 7a62 (rev 11) 00:1b.0 PCI bridge: Intel Corporation Device 7a40 (rev 11) 00:1b.4 PCI bridge: Intel Corporation Device 7a44 (rev 11) 00:1c.0 PCI bridge: Intel Corporation Device 7a38 (rev 11) 00:1c.3 PCI bridge: Intel Corporation Device 7a3b (rev 11) 00:1f.0 ISA bridge: Intel Corporation Device 7a04 (rev 11) 00:1f.3 Audio device: Intel Corporation Device 7a50 (rev 11) 00:1f.4 SMBus: Intel Corporation Device 7a23 (rev 11) 00:1f.5 Serial bus controller: Intel Corporation Device 7a24 (rev 11) 02:00.0 Non-Volatile memory controller: Micron/Crucial Technology P2 NVMe PCIe SSD (rev 01) 03:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3080] (rev a1) 03:00.1 Audio device: NVIDIA Corporation GA102 High Definition Audio Controller (rev a1) 04:00.0 Ethernet controller: Intel Corporation Ethernet Controller I225-V (rev 03) ------------------------------ Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 001 Device 003: ID 0db0:d1d7 Micro Star International USB Audio Bus 001 Device 002: ID 1462:7e07 Micro Star International MYSTIC LIGHT Bus 001 Device 008: ID 8087:0033 Intel Corp. Bus 001 Device 006: ID 05e3:0608 Genesys Logic, Inc. Hub Bus 001 Device 007: ID 1c4f:0002 SiGma Micro Keyboard TRACER Gamma Ivory Bus 001 Device 005: ID 05e3:0608 Genesys Logic, Inc. Hub Bus 001 Device 004: ID 28bd:0913 XP-Pen 4 inch PenTablet Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 16 On-line CPU(s) list: 0-15 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz CPU family: 6 Model: 165 Thread(s) per core: 2 Core(s) per socket: 8 Socket(s): 1 Stepping: 5 CPU max MHz: 5100.0000 CPU min MHz: 800.0000 BogoMIPS: 7599.80 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp pku ospke md_clear flush_l1d arch_capabilities Virtualization: VT-x L1d cache: 256 KiB (8 instances) L1i cache: 256 KiB (8 instances) L2 cache: 2 MiB (8 instances) L3 cache: 16 MiB (1 instance) NUMA node(s): 1 NUMA node0 CPU(s): 0-15 Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled Vulnerability L1tf: Not affected Vulnerability Mds: Not affected Vulnerability Meltdown: Not affected Vulnerability Mmio stale data: Mitigation; Clear CPU buffers; SMT vulnerable Vulnerability Retbleed: Mitigation; Enhanced IBRS Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization Vulnerability Spectre v2: Mitigation; Enhanced IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence Vulnerability Srbds: Mitigation; Microcode Vulnerability Tsx async abort: Not affected ------------------------------ System: Kernel 5.15.0-71-generic x86_64 bits 64 Console N/A Distro Linux Mint 21.1 Vera Machine: Type Desktop Mobo Gigabyte model Z490 UD v x.x serialUEFI American Megatrends v F5 date 08/28/2020 CPU: Info 8-core Intel Core i7-10700K [MT MCP] speed (MHz) avg 4700 min/max 800/5100 Graphics: Device-1 NVIDIA GA102 [GeForce RTX 3090] driver nvidia v 470.182.03 Device-2 Microdia Webcam Vitade AF type USB driver snd-usb-audio,uvcvideo Display server X.org v 1.21.1.4 with Xwayland v 22.1.1 driver X loaded nvidia gpu nvidia tty 80x40 Message GL data unavailable in console. 
Try -G --display Network: Device-1 Intel Wi-Fi 6 AX200 driver iwlwifi Device-2 Realtek RTL8111/8168/8411 PCI Express Gigabit Ethernet driver r8169 Drives: Local Storage total 15.46 TiB used 8.22 TiB (53.2%) Info: Processes 1048 Uptime 25d 11h 6m Memory 62.73 GiB used 41.24 GiB (65.8%) Init systemd runlevel 5 Client Unknown python3.9 client inxi 3.3.13 ------------------------------ 00:00.0 Host bridge: Intel Corporation Device 9b43 (rev 05) 00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 05) 00:12.0 Signal processing controller: Intel Corporation Comet Lake PCH Thermal Controller 00:14.0 USB controller: Intel Corporation Comet Lake USB 3.1 xHCI Host Controller 00:14.2 RAM memory: Intel Corporation Comet Lake PCH Shared SRAM 00:16.0 Communication controller: Intel Corporation Comet Lake HECI Controller 00:17.0 SATA controller: Intel Corporation Comet Lake SATA AHCI Controller 00:1b.0 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #17 (rev f0) 00:1b.2 PCI bridge: Intel Corporation Device 06c2 (rev f0) 00:1b.4 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #21 (rev f0) 00:1c.0 PCI bridge: Intel Corporation Device 06b8 (rev f0) 00:1c.3 PCI bridge: Intel Corporation Device 06bb (rev f0) 00:1c.4 PCI bridge: Intel Corporation Device 06bc (rev f0) 00:1d.0 PCI bridge: Intel Corporation Comet Lake PCI Express Root Port #9 (rev f0) 00:1d.4 PCI bridge: Intel Corporation Device 06b4 (rev f0) 00:1f.0 ISA bridge: Intel Corporation Device 0685 00:1f.3 Audio device: Intel Corporation Comet Lake PCH cAVS 00:1f.4 SMBus: Intel Corporation Comet Lake PCH SMBus Controller 00:1f.5 Serial bus controller: Intel Corporation Comet Lake PCH SPI Controller 01:00.0 VGA compatible controller: NVIDIA Corporation GA102 [GeForce RTX 3090] (rev a1) 01:00.1 Audio device: NVIDIA Corporation GA102 High Definition Audio Controller (rev a1) 03:00.0 Network controller: Intel Corporation Wi-Fi 6 AX200 (rev 1a) 04:00.0 Non-Volatile memory controller: Sandisk Corp WD Black SN750 / PC SN730 NVMe SSD 06:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 16) 07:00.0 Non-Volatile memory controller: Micron/Crucial Technology P2 NVMe PCIe SSD (rev 01) 08:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 ------------------------------ Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 001 Device 008: ID 320f:7c1a S U Bus 001 Device 012: ID 0c76:161e JMTek, LLC. USB PnP Audio Device Bus 001 Device 006: ID 28bd:0913 XP-Pen 4 inch PenTablet Bus 001 Device 043: ID 0c45:6366 Microdia Webcam Vitade AF Bus 001 Device 011: ID 048d:5702 Integrated Technology Express, Inc. ITE Device Bus 001 Device 010: ID 8087:0029 Intel Corp. AX200 Bluetooth Bus 001 Device 009: ID 04d9:fc38 Holtek Semiconductor, Inc. Gaming Mouse [Redragon M602-RGB] Bus 001 Device 004: ID 046d:0a73 Logitech, Inc. BLAST Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz CPU family: 6 Model: 158 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 Stepping: 9 CPU max MHz: 4200.0000 CPU min MHz: 800.0000 BogoMIPS: 7200.00 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities Virtualization: VT-x L1d cache: 128 KiB (4 instances) L1i cache: 128 KiB (4 instances) L2 cache: 1 MiB (4 instances) L3 cache: 8 MiB (1 instance) NUMA node(s): 1 NUMA node0 CPU(s): 0-7 Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable Vulnerability Mds: Mitigation; Clear CPU buffers; SMT vulnerable Vulnerability Meltdown: Mitigation; PTI Vulnerability Mmio stale data: Mitigation; Clear CPU buffers; SMT vulnerable Vulnerability Retbleed: Mitigation; IBRS Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization Vulnerability Spectre v2: Mitigation; IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected Vulnerability Srbds: Mitigation; Microcode Vulnerability Tsx async abort: Mitigation; TSX disabled ------------------------------ System: Kernel 5.15.0-72-generic x86_64 bits 64 Console N/A Distro Linux Mint 21 Vanessa Machine: Type Desktop System Dell product OptiPlex 5050 v N/A serialMobo Dell model 0FDY5C v A00 serial UEFI Dell v 1.24.0 date 12/22/2022 CPU: Info quad core Intel Core i7-7700 [MT MCP] speed (MHz) avg 1543 min/max 800/4200 Graphics: Device-1 Intel HD Graphics 630 driver i915 v kernel Display server X.org v 1.21.1.4 with Xwayland v 22.1.1 driver X loaded modesetting unloaded fbdev,vesa gpu i915 tty 80x40 Message GL data unavailable in console. 
Try -G --display Network: Device-1 Intel Ethernet I219-V driver e1000e Device-2 Linksys WUSB6300 V2 type USB driver rtl88x2bu Drives: Local Storage total 238.47 GiB used 84.31 GiB (35.4%) Info: Processes 271 Uptime 2d 21h 12m Memory 15.49 GiB used 1.55 GiB (10.0%) Init systemd runlevel 5 Client Unknown python3.9 client inxi 3.3.13 ------------------------------ 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05) 00:02.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04) 00:14.0 USB controller: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller 00:14.2 Signal processing controller: Intel Corporation 200 Series PCH Thermal Subsystem 00:16.0 Communication controller: Intel Corporation 200 Series PCH CSME HECI #1 00:17.0 SATA controller: Intel Corporation 200 Series PCH SATA controller [AHCI mode] 00:1f.0 ISA bridge: Intel Corporation 200 Series PCH LPC Controller (Q270) 00:1f.2 Memory controller: Intel Corporation 200 Series/Z370 Chipset Family Power Management Controller 00:1f.3 Audio device: Intel Corporation 200 Series PCH HD Audio 00:1f.4 SMBus: Intel Corporation 200 Series/Z370 Chipset Family SMBus Controller 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (5) I219-V ------------------------------ Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 001 Device 002: ID 13b1:0045 Linksys WUSB6300 V2 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Architecture: x86_64 CPU op-mode(s): 32-bit, 64-bit Address sizes: 39 bits physical, 48 bits virtual Byte Order: Little Endian CPU(s): 8 On-line CPU(s) list: 0-7 Vendor ID: GenuineIntel Model name: Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz CPU family: 6 Model: 158 Thread(s) per core: 2 Core(s) per socket: 4 Socket(s): 1 Stepping: 9 CPU max MHz: 4200.0000 CPU min MHz: 800.0000 BogoMIPS: 7200.00 Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d arch_capabilities Virtualization: VT-x L1d cache: 128 KiB (4 instances) L1i cache: 128 KiB (4 instances) L2 cache: 1 MiB (4 instances) L3 cache: 8 MiB (1 instance) NUMA node(s): 1 NUMA node0 CPU(s): 0-7 Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable Vulnerability Mds: Mitigation; Clear CPU buffers; SMT vulnerable Vulnerability Meltdown: Mitigation; PTI Vulnerability Mmio stale data: Mitigation; Clear CPU buffers; SMT vulnerable Vulnerability Retbleed: Mitigation; IBRS Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization Vulnerability Spectre v2: Mitigation; IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS Not affected Vulnerability Srbds: Mitigation; Microcode Vulnerability Tsx async abort: Mitigation; TSX disabled ------------------------------ System: Kernel 5.15.0-72-generic x86_64 bits 64 Console N/A Distro Linux Mint 21 Vanessa Machine: Type Desktop System Dell product OptiPlex 5050 v N/A serialMobo Dell model 0FDY5C v A00 serial UEFI Dell v 1.24.0 date 12/22/2022 CPU: Info quad core Intel Core i7-7700 [MT MCP] speed (MHz) avg 1569 min/max 800/4200 Graphics: Device-1 Intel HD Graphics 630 driver i915 v kernel Display server X.org v 1.21.1.4 with Xwayland v 22.1.1 driver X loaded modesetting unloaded fbdev,vesa gpu i915 tty 80x40 Message GL data unavailable in console. 
Try -G --display Network: Device-1 Intel Ethernet I219-V driver e1000e Drives: Local Storage total 953.87 GiB used 67 GiB (7.0%) Info: Processes 221 Uptime 2d 21h 10m Memory 31.22 GiB used 1.74 GiB (5.6%) Init systemd runlevel 5 Client Unknown python3.9 client inxi 3.3.13 ------------------------------ 00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers (rev 05) 00:02.0 VGA compatible controller: Intel Corporation HD Graphics 630 (rev 04) 00:14.0 USB controller: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller 00:14.2 Signal processing controller: Intel Corporation 200 Series PCH Thermal Subsystem 00:16.0 Communication controller: Intel Corporation 200 Series PCH CSME HECI #1 00:17.0 SATA controller: Intel Corporation 200 Series PCH SATA controller [AHCI mode] 00:1f.0 ISA bridge: Intel Corporation 200 Series PCH LPC Controller (Q270) 00:1f.2 Memory controller: Intel Corporation 200 Series/Z370 Chipset Family Power Management Controller 00:1f.3 Audio device: Intel Corporation 200 Series PCH HD Audio 00:1f.4 SMBus: Intel Corporation 200 Series/Z370 Chipset Family SMBus Controller 00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (5) I219-V ------------------------------ Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
# start an iperf server on every node (in a background thread, since "iperf -s" blocks), plus a helper to kill them all later
startServer = lambda: threading.Thread(target=lambda: applyCl.cmd("iperf -s", timeout=3600)).start()
stopServer = lambda: applyCl.cmd("pkill iperf", timeout=60)
# parses raw iperf client output into [client ip, server ip, bandwidth (Mbits/sec)] rows, converting Gbits/sec figures to Mbits/sec
guts = cut(1) | apply(rows(4, 6) | op().split(" ").all() | filt(op().startswith("192")) + tail(2) | joinStreams()) | toInt(2) | deref() | filt(op().startswith("Gbits"), 3).split() | apply(op()*1000, 2) + iden() | joinStreams() | cut()[:3]
#notest
with k1.timer() as t:
    startServer(); data = [[applyCl.nodeIds(), nodeId2Ip, nodeId2Cpu, nameTrans]]
    for nodeId in applyCl.nodeIds():
        ip = nodeId2Ip[nodeId]; res = applyCl.nodeIds() | apply(lambda nId: [nId] | applyCl.aS(lambda: None | cmd(f"iperf -c {ip}") | deref(), timeout=60) | deref()) | deref()
        data.append([ip, res])
    data | aS(dill.dumps) | file("net-serial.pth"); stopServer()
t()
167.6787669658661
[nodeIds, nodeId2Ip, nodeId2Cpu, nameTrans], *runs = cat("net-serial.pth", False) | aS(dill.loads); runs | cut(1) | item(2)
[['51306d246cd47ada0addf519f27c0d313953d20aa3fe9e2080f2bb0d', ['------------------------------------------------------------', 'Client connecting to 192.168.1.57, TCP port 5001', 'TCP window size: 2.50 MByte (default)', '------------------------------------------------------------', '[ 1] local 192.168.1.57 port 55166 connected with 192.168.1.57 port 5001', '[ ID] Interval Transfer Bandwidth', '[ 1] 0.0000-10.0072 sec 144 GBytes 123 Gbits/sec']]]
refined = runs | cut(1) | apply(item().all() | guts) | joinStreams() | deref(); refined[:4]
[['192.168.1.57', '192.168.1.57', 123000], ['192.168.1.35', '192.168.1.57', 939], ['192.168.1.43', '192.168.1.57', 470], ['192.168.1.53', '192.168.1.57', 467]]
g = k1.digraph(); refined | apply(op().split(".")[-1], [0, 1]) | filt(~aS(lambda x,y,z: x!=y)) | lookup(nameTrans, [0, 1]) | ~apply(lambda x,y,z: g(x,y,label=f" {z}")) | deref(); g("client", "server"); g
Very interesting. Communication between the 16-core and 32-core computers is always fast, at around 950 Mbits/sec, essentially saturating the gigabit link. However, for the 8-core computers, it's half that. So I'm guessing their network cards are just not fast enough? But it's weird that connections between the 8-core computers and the higher-core computers are asymmetric: fetching speed is low, but serving speed is high. No idea what that's about. What if we test all network traffic at once, instead of one link at a time?
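To put numbers behind that asymmetry without squinting at the graph, here's a minimal sketch (plain Python, not an applyCl feature) that tabulates the measured bandwidth per client/server link from refined:
# Minimal sketch (plain Python): one bandwidth figure (Mbits/sec) per client -> server link,
# skipping the loopback runs, sorted so the slowest links come first
links = {(c.split(".")[-1], s.split(".")[-1]): mbits for c, s, mbits in refined if c != s}
sorted(links.items(), key=lambda kv: kv[1])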
#notest
with k1.timer() as t:
    startServer(); nodeIds = applyCl.nodeIds(); time.sleep(1)
    a = nodeIds | apply(wrapList() | insert(nodeIds, False)) | ungroup() | lookup(nodeId2Ip, 1) | applyCl(lambda ip: None | cmd(f"iperf -c {ip}") | deref(), pre=True, timeout=300) | deref()
    stopServer(); [[applyCl.nodeIds(), nodeId2Ip, nodeId2Cpu, nameTrans], *a] | aS(dill.dumps) | file("net-parallel.pth")
t()
13.722673654556274
# thumbnail
g = k1.digraph(); cat("net-parallel.pth", False) | aS(dill.loads) | op()[1:] | guts | apply(op().split(".")[-1], [0, 1]) | filt(~aS(lambda x,y,z: x!=y)) | lookup(nameTrans, [0, 1]) | ~apply(lambda x,y,z: g(x,y,label=f" {z}")) | deref(); g("client", "server"); g | toImg()
Interesting.
from k1lib.imports import *
base = "~/ssd2/test/balanceFile"; fn = f"{base}/a.txt"
nodeId2Cpu = None | applyCl.aS(lambda: applyCl.cpu()) | deref()
# pick out node ids by thread count: c32 is the 32-thread node, c16 the 16-thread one, c81/c82 the two 8-thread ones
c32, [[c16], [c81, c82]] = nodeId2Cpu | filt(op() == 32, 1).split() | item(2) + (filt(op() == 16, 1).split() | item().all(2))
applyCl.cmd(f"rm -r {base}"); applyCl.cmd(f"mkdir -p {base}")
2023-05-22 13:38:28,373 INFO worker.py:1364 -- Connecting to existing Ray cluster at address: 192.168.1.35:6379...
2023-05-22 13:38:28,378 INFO worker.py:1544 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265
[['51306d246cd47ada0addf519f27c0d313953d20aa3fe9e2080f2bb0d', []], ['244f4e340443f74ab302822376195329f45858b65998b4a77f5f4732', []], ['ef95788f939f2409d731d963c3ea9281aa1b42f3a091faebc42f65bc', []], ['bb09a2d34473af60b2274afc49b5b7f8ee11255bd7faf58b48c89dd1', []]]
#notest
with k1.timer() as t:
    [c32, c16] | applyCl.aS(lambda: "0123456789\n" | repeat(300_000_000) | file(fn), timeout=600) | deref()
t()
48.86774706840515
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Replicated files ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFile/a.txt  7.2 GB      3.6 GB         3.6 GB         0.0 B         0.0 B

A replicated file is a file that has been copied to multiple nodes. Size of all file copies should be the same. It's managed by applyCl.replicateFile()
#notest
with k1.timer() as t: applyCl.balanceFile(fn)
t()
Transferring data to new nodes: 100% | 100%, 26s elapsed
26.39429473876953
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed files ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFile/a.txt  7.2 GB      3.6 GB         1.8 GB         900.0 MB      900.0 MB

A distributed file is a file that has been split into multiple pieces and sent to other nodes. It's managed by applyCl.balanceFile()
1.8GB in 26s, or 69MB/s. Nice.
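The same arithmetic comes up several times below, so here's a tiny hypothetical helper (not part of k1lib) to double-check those throughput figures:
# Hypothetical helper, just to sanity-check the throughput figures quoted in this notebook.
# Takes data moved in GB and elapsed seconds, returns MB/s.
def throughputMBps(gb, seconds): return round(gb * 1000 / seconds, 1)
throughputMBps(1.8, 26.4)   # ~68.2 MB/s, in line with the ~69 MB/s above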
#notest
with k1.timer() as t: applyCl.decommissionFile(fn, [c32, c16])
t()
Decommissioning: 100% | 100%, 58s elapsed
58.361918449401855
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Replicated files ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFile/a.txt  7.2 GB      0.0 B          0.0 B          3.6 GB        3.6 GB

A replicated file is a file that has been copied to multiple nodes. Size of all file copies should be the same. It's managed by applyCl.replicateFile()
5.4GB in 58s, or 93MB/s
#notest
with k1.timer() as t: applyCl.balanceFile(fn, None, [c16, c81, c82])
t()
Transferring data to new nodes: 100%, 270s elapsed
270.2651343345642
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed files ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFile/a.txt  7.2 GB      0.0 B          3.6 GB         1.8 GB        1.8 GB

A distributed file is a file that has been split into multiple pieces and sent to other nodes. It's managed by applyCl.balanceFile()
3.6GB in 270s, or 13MB/s. So pathetic.
#notest
with k1.timer() as t: applyCl.decommissionFile(fn, [c81, c82])
t()
Transferring data to new nodes: 100%, 59s elapsed
417.87030482292175
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed files ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFile/a.txt  7.2 GB      4.8 GB         2.4 GB         0.0 B         0.0 B

A distributed file is a file that has been split into multiple pieces and sent to other nodes. It's managed by applyCl.balanceFile()
Note that the whole operation took 417s even though the progress bar only showed 59s. There are actually 2 steps:
#notest
with k1.timer() as t: applyCl.balanceFile(fn)
t()
Transferring data to new nodes: 100% | 100%, 17s elapsed
17.610517978668213
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed files ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFile/a.txt  7.2 GB      3.6 GB         1.8 GB         900.0 MB      900.0 MB

A distributed file is a file that has been split into multiple pieces and sent to other nodes. It's managed by applyCl.balanceFile()
1.8GB in 17s, or 106MB/s
#notest
with k1.timer() as t: applyCl.decommissionFile(fn, [c81, c82])
t()
Decommissioning: 100% | 100%, 52s elapsed
52.10003304481506
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed files ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFile/a.txt  7.2 GB      4.8 GB         2.4 GB         0.0 B         0.0 B

A distributed file is a file that has been split into multiple pieces and sent to other nodes. It's managed by applyCl.balanceFile()
1.8GB in 52s, or 34.6MB/s
Honestly, no idea why the results are like this. It's just all over the place. There are times when decommissioning is very slow, but other times it's normal. Most of the time, spreading data outwards is a lot faster, but sometimes it's only average. No clue what's happening underneath, and no idea why there's such an asymmetry.
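One way to pin this down would be to average over several balance/decommission cycles instead of judging from single runs. A rough, untested sketch of the idea, reusing fn, c81 and c82 from the cells above:
# Rough sketch: repeat the balance/decommission cycle a few times and average the timings,
# to smooth out the run-to-run noise described above. Reuses fn, c81, c82 from earlier cells.
times = []
for _ in range(3):
    with k1.timer() as t1: applyCl.balanceFile(fn)
    with k1.timer() as t2: applyCl.decommissionFile(fn, [c81, c82])
    times.append([t1(), t2()])
times | transpose() | apply(lambda xs: sum(xs)/len(xs)) | deref()   # [mean balance time, mean decommission time]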
from k1lib.imports import *
base = "~/ssd2/test/balanceFolder"
nodeId2Cpu = None | applyCl.aS(lambda: applyCl.cpu()) | deref()
c32, [[c16], [c81, c82]] = nodeId2Cpu | filt(op() == 32, 1).split() | item(2) + (filt(op() == 16, 1).split() | item().all(2))
applyCl.cmd(f"rm -r {base}"); applyCl.cmd(f"mkdir -p {base}")
content = k1.Wrapper("0123456789\n"*100)
2023-05-22 16:35:16,604 INFO worker.py:1364 -- Connecting to existing Ray cluster at address: 192.168.1.35:6379...
2023-05-22 16:35:16,609 INFO worker.py:1544 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265
#notest
with k1.timer() as t:
    np.linspace(10000, 100000, 100).astype(int) | insertIdColumn() | ~apply(lambda idx, x: content() | repeat(x) | file(f"{base}/{idx}.txt")) | ignore()
t()
5.0402116775512695
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed folders ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFolder      6.06 GB     0.0 B          6.06 GB        0.0 B         0.0 B

A distributed folder is a folder that has many files and folders inside, but their names are all different from each other. It's managed by applyCl.balanceFolder()
#notest
with k1.timer() as t, k1.captureStdout(False): applyCl.balanceFolder(base)
t()
57.85381722450256
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed folders ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFolder      6.06 GB     3.03 GB        1.52 GB        750.67 MB     759.69 MB

A distributed folder is a folder that has many files and folders inside, but their names are all different from each other. It's managed by applyCl.balanceFolder()
4.5GB in 58s, or 78MB/s. Pretty good actually.
#notest
with k1.timer() as t, k1.captureStdout(False): applyCl.decommissionFolder(base, [c32, c16])
t()
72.67098760604858
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed folders ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFolder      6.06 GB     0.0 B          0.0 B          3.13 GB       2.92 GB

A distributed folder is a folder that has many files and folders inside, but their names are all different from each other. It's managed by applyCl.balanceFolder()
4.6GB in 72s, or 63MB/s. Still about the same speed.
#notest
with k1.timer() as t, k1.captureStdout(False): applyCl.balanceFolder(base)
t()
55.68014645576477
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed folders ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/balanceFolder      6.06 GB     2.99 GB        1.53 GB        738.67 MB     799.72 MB

A distributed folder is a folder that has many files and folders inside, but their names are all different from each other. It's managed by applyCl.balanceFolder()
4.52GB in 56s, or 80.7MB/s
base2 = "~/ssd2/test/dillPerf"; applyCl.cmd(f"rm -rf {base2}"); applyCl.cmd(f"mkdir -p {base2}")
[['51306d246cd47ada0addf519f27c0d313953d20aa3fe9e2080f2bb0d', []], ['244f4e340443f74ab302822376195329f45858b65998b4a77f5f4732', []], ['ef95788f939f2409d731d963c3ea9281aa1b42f3a091faebc42f65bc', []], ['bb09a2d34473af60b2274afc49b5b7f8ee11255bd7faf58b48c89dd1', []]]
#notest
with k1.timer() as t: None | applyCl.aS(lambda: ls(base)) | ungroup() | applyCl(lambda fn: cat(fn) | aS(list) | aS(dill.dumps) | file(f"{base2}/{fn.split('/')[-1]}"), pre=True) | deref()
t()
(raylet, ip=192.168.1.43) [2023-05-22 16:50:18,469 E 3682 3682] (raylet) node_manager.cc:3040: 1 Workers (tasks / actors) killed due to memory pressure (OOM), 0 Workers crashed due to other reasons at node (ID: ef95788f939f2409d731d963c3ea9281aa1b42f3a091faebc42f65bc, IP: 192.168.1.43) over the last time period. To see more information about the Workers killed on this node, use `ray logs raylet.out -ip 192.168.1.43`
(raylet, ip=192.168.1.43)
(raylet, ip=192.168.1.43) Refer to the documentation on how to address the out of memory issue: https://docs.ray.io/en/latest/ray-core/scheduling/ray-oom-prevention.html. Consider provisioning more memory on this node or reducing task parallelism by requesting more CPUs per task. To adjust the kill threshold, set the environment variable `RAY_memory_usage_threshold` when starting Ray. To disable worker killing, set the environment variable `RAY_memory_monitor_refresh_ms` to zero.
(raylet) [2023-05-22 16:50:28,562 E 1019343 1019343] (raylet) node_manager.cc:3040: 9 Workers (tasks / actors) killed due to memory pressure (OOM), 0 Workers crashed due to other reasons at node (ID: 244f4e340443f74ab302822376195329f45858b65998b4a77f5f4732, IP: 192.168.1.35) over the last time period. To see more information about the Workers killed on this node, use `ray logs raylet.out -ip 192.168.1.35`
(raylet)
(raylet) Refer to the documentation on how to address the out of memory issue: https://docs.ray.io/en/latest/ray-core/scheduling/ray-oom-prevention.html. Consider provisioning more memory on this node or reducing task parallelism by requesting more CPUs per task. To adjust the kill threshold, set the environment variable `RAY_memory_usage_threshold` when starting Ray. To disable worker killing, set the environment variable `RAY_memory_monitor_refresh_ms` to zero.
56.632163524627686
applyCl.diskScan("~/ssd2/test")
------------------------------------------------------------ Distributed folders ------------------------------------------------------------
Path                                      Total size  Size on each node (node id and thread count)
                                                      51306, 32 thr  244f4, 16 thr  ef957, 8 thr  bb09a, 8 thr
----------------------------------------  ----------  -------------  -------------  ------------  ------------
/home/kelvin/ssd2/test/dillPerf           7.16 GB     3.53 GB        1.81 GB        873.77 MB     946.0 MB
/home/kelvin/ssd2/test/balanceFolder      6.06 GB     2.99 GB        1.53 GB        738.67 MB     799.72 MB

A distributed folder is a folder that has many files and folders inside, but their names are all different from each other. It's managed by applyCl.balanceFolder()
Reading text files:
#notest
with k1.timer() as t: None | applyCl.aS(lambda: ls(base)) | ungroup() | applyCl(lambda fn: cat(fn) | shape(0), pre=True) | deref()
t()
2.0585927963256836
Reading pickled files:
#notest
with k1.timer() as t: None | applyCl.aS(lambda: ls(base2)) | ungroup() | applyCl(lambda fn: cat(fn, False) | shape(0), pre=True) | deref()
t()
0.2251725196838379
Roughly 10x faster. Nice. So pickle-loading from disk is not too taxing after all.
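If you want to reproduce that gap on a single machine without the cluster, here's a minimal sketch in plain Python (hypothetical local file names), comparing parsing a text file against loading a pre-pickled list of its lines:
# Minimal single-node sketch (plain Python, hypothetical file names) of the same comparison
import pickle, time
lines = ["0123456789"] * 1_000_000                          # ~11 MB of text once written out
with open("sample.txt", "w") as f: f.write("\n".join(lines))
with open("sample.pkl", "wb") as f: pickle.dump(lines, f)
t0 = time.time(); nText = sum(1 for _ in open("sample.txt")); t1 = time.time()
nPkl = len(pickle.load(open("sample.pkl", "rb"))); t2 = time.time()
print(f"text: {t1 - t0:.3f}s, pickle: {t2 - t1:.3f}s, lines: {nText} vs {nPkl}")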
from k1lib.imports import *
# node ids sorted by thread count, descending: 32, 16, 8, 8
c32, c16, c81, c82 = None | applyCl.aS(lambda: applyCl.cpu()) | ~sort(1) | cut(0) | deref()
2023-05-23 16:17:01,732 INFO worker.py:1364 -- Connecting to existing Ray cluster at address: 192.168.1.35:6379...
2023-05-23 16:17:01,737 INFO worker.py:1544 -- Connected to Ray cluster. View the dashboard at 127.0.0.1:8265
def run(nodeId, n=100):
    data = []
    for i in range(1, n):
        with k1.timer() as t: nodeId | repeat(i) | insertIdColumn(begin=False) | applyCl(lambda x: range(1_000_000_000) | toSum(), pre=True, timeout=3600) | deref()
        data.append([i, t()])
    return data
#notest
def run2(nodeId, n, fn): run(nodeId, n) | aS(dill.dumps) | file(fn)
ts = [[c32, 144, "c32.pth"], [c16, 48, "c16.pth"], [c81, 24, "c81.pth"], [c82, 24, "c82.pth"]] | apply(lambda args: threading.Thread(target=run2, args=args)) | deref()
ts | op().start().all() | deref(); ts | op().join(3600).all() | deref()
[None, None, None, None]
c32, c16, c81, c82 = cs = ["c32", "c16", "c81", "c82"] | apply(op() + ".pth") | apply(cat, text=False) | apply(dill.loads) | deref()
labels = ["32 threads", "16 threads", "8 threads", "8 threads"]
cs | apply(head(50) | transpose() | ~aS(plt.plot)) | ignore()
plt.grid(True); plt.xlabel("Threads"); plt.ylabel("Running time (seconds)"); plt.legend(labels);
Normalizing by thread/core:
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
plt.sca(axes[0]); [[32, 16, 8, 8], cs] | transpose() | ~apply(lambda cores, data: data | apply(op()/cores, 0)) | apply(transpose() | ~aS(plt.plot)) | ignore()
plt.grid(True); plt.legend(labels); plt.xlabel("Workload/thread"); plt.ylabel("Running time (seconds)");
plt.sca(axes[1]); [[24, 8, 4, 4], cs] | transpose() | ~apply(lambda cores, data: data | apply(op()/cores, 0)) | apply(transpose() | ~aS(plt.plot)) | ignore()
plt.grid(True); plt.legend(labels); plt.xlabel("Workload/core"); plt.ylabel("Running time (seconds)"); plt.tight_layout()
The 13900K is so much faster than the 7700. It's quite amazing. You can clearly see the improvement with every CPU generation. Let's see if we can estimate the effective number of cores for each computer:
[[42, 10, 4, 4], cs] | transpose() | ~apply(lambda cores, data: data | apply(op()/cores, 0)) | apply(transpose() | ~aS(plt.plot)) | ignore()
plt.grid(True); plt.legend(labels); plt.xlabel("Workload/core"); plt.ylabel("Running time (seconds)"); plt.xlim(0, 6); plt.ylim(0, 100); plt.tight_layout()
This is so useful that I've integrated load testing into applyCl:
from k1lib.cli import _applyCl
_applyCl.load_loadTest()
{'gASVJAAAAAAAAABDIMK2cOEX3eEqdZryXbOPetsY4VB0Yqxg7LmgjiId9ME5lC4=': 2.4737823688848835, 'gASVJAAAAAAAAABDIDtamfDeW78RkAQreAVqSViFxvdqnnLiYDM3Bf3dHzhJlC4=': 1.1354472459235292, 'gASVJAAAAAAAAABDIAPlCMRdwqeu2524mRPJg4gdG+dSgBSEdf9htIujfuZKlC4=': 1.0150925540272815}
The 13900K is about 2.5x as powerful as the 7700 per thread. Impressive.
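One obvious use for these numbers is to weight how much work each node gets by its per-thread speed times its thread count. A sketch in plain Python (the speed factors are rounded from the loadTest output above; which factor belongs to which machine, and the thread counts, are assumptions about this cluster):
# Sketch: split 1000 work chunks across nodes proportionally to (per-thread speed x thread count).
# Speed factors rounded from the loadTest output above; node names and thread counts are assumed.
speeds  = {"13900K": 2.47, "10700K": 1.14, "7700": 1.02}
threads = {"13900K": 32,   "10700K": 16,   "7700": 8}
total   = sum(speeds[n] * threads[n] for n in speeds)
{n: round(1000 * speeds[n] * threads[n] / total) for n in speeds}   # chunks per node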