.. This file has "irst" as an extension to ensure that it's not parsed by Sphinx as is. Instead, it's included in another file that is parsed.
This tutorial presents how to perform faithful IO experiments in
SimGrid. It is based on the paper "Adding Storage Simulation
Capacities to the SimGrid Toolkit: Concepts, Models, and API".

The paper presents a series of experiments to analyze the performance
of IO operations (read/write) on different kinds of disks (SATA, SAS,
SSD). In this tutorial, we present a detailed example of how to
extract experimental data to simulate: i) performance degradation
with concurrent operations (Fig. 8 in the paper) and ii) variability
in IO operations (Fig. 5 to 7).

- Link for the paper: `https://hal.inria.fr/hal-01197128 <https://hal.inria.fr/hal-01197128>`_

- Link for the data: `https://figshare.com/articles/dataset/Companion_of_the_SimGrid_storage_modeling_article/1175156 <https://figshare.com/articles/dataset/Companion_of_the_SimGrid_storage_modeling_article/1175156>`_
- The purpose of this document is to illustrate how we can
  extract data from experiments and inject it into SimGrid. However, the
  data shown on this page may **not** reflect reality.

- You must run similar experiments on your hardware to get realistic
  data for your context.

- SimGrid has been in active development since the paper was released in
  2015, so the XML description used in the paper may have evolved,
  and MSG has been superseded by S4U since then.
A Dockerfile is available in ``docs/source/tuto_disk``. It allows you to
re-run this tutorial. For that, build the image and run the container:

- ``docker build -t tuto_disk .``

- ``docker run -it tuto_disk``
Analyzing the experimental data
===============================

We start by analyzing and extracting the real data available.
We use a special method to create non-uniform histograms to represent
the noise in IO operations.

As the library could not be installed properly, the important methods were
copied here from: `https://rdrr.io/github/dlebauer/pecan-priors/src/R/plots.R <https://rdrr.io/github/dlebauer/pecan-priors/src/R/plots.R>`_
Some initial configurations/list of packages. Use
``suppressPackageStartupMessages()`` to eliminate package startup messages.

::

   Attaching package: 'dplyr'

   The following objects are masked from 'package:plyr':

       arrange, count, desc, failwith, id, mutate, rename, summarise,
       summarize

   The following objects are masked from 'package:stats':

       filter, lag

   The following objects are masked from 'package:base':

       intersect, setdiff, setequal, union

   Attaching package: 'gridExtra'

   The following object is masked from 'package:dplyr':

       combine
This was copied from the ``sg_storage_ccgrid15.org`` available in the
figshare companion of the paper. Before executing this code, please download and
decompress the appropriate file.

.. code-block:: shell

   curl -O -J -L "https://ndownloader.figshare.com/files/1928095"
Preparing the data for the variability analysis.
.. code-block:: r

   clean_up <- function (df, infra){
     names(df) <- c("Hostname","Date","DirectIO","IOengine","IOscheduler","Error","Operation","Jobs","BufferSize","FileSize","Runtime","Bandwidth","BandwidthMin","BandwidthMax","Latency","LatencyMin","LatencyMax","IOPS")
     df = subset(df, Error == "0")
     df = subset(df, DirectIO == "1")
     df <- merge(df, infra, by = "Hostname")
     df$Hostname = sapply(strsplit(df$Hostname, "[.]"), "[", 1)
     df$HostModel = paste(df$Hostname, df$Model, sep = " - ")
     df$Duration = df$Runtime/1000 # fio outputs runtime in msec, we want to display seconds
     df$Size = df$FileSize/1024/1024
     df = subset(df, Duration != 0.000)
     df$Bwi = df$Duration/df$Size
     df[df$Operation == "read",]$Operation <- "Read"
     df[df$Operation == "write",]$Operation <- "Write"
     df # return the cleaned data frame
   }
.. code-block:: r

   grenoble <- read.csv('./bench/grenoble.csv', header=FALSE, sep=";", stringsAsFactors=FALSE)
   luxembourg <- read.csv('./bench/luxembourg.csv', header=FALSE, sep=";", stringsAsFactors=FALSE)
   nancy <- read.csv('./bench/nancy.csv', header=FALSE, sep=";", stringsAsFactors=FALSE)
   all <- rbind(grenoble, nancy, luxembourg)
   infra <- read.csv('./bench/infra.csv', header=FALSE, sep=";", stringsAsFactors=FALSE)
   names(infra) <- c("Hostname","Model","DiskSize")

   all = clean_up(all, infra)
   griffon = subset(all, grepl("^griffon", Hostname))
   griffon$Cluster <- "Griffon (SATA II)"
   edel = subset(all, grepl("^edel", Hostname))
   edel$Cluster <- "Edel (SSD)"

   df = rbind(griffon[griffon$Jobs=="1" & griffon$IOscheduler=="cfq",],
              edel[edel$Jobs=="1" & edel$IOscheduler=="cfq",])
   # Get rid of the 64 GB disks of Edel as they behave differently (used to be "edel-51")
   df = df[!(grepl("^Edel", df$Cluster) & df$DiskSize=="64 GB"),]
Preparing the data for the concurrency analysis.
.. code-block:: r

   dfc = rbind(griffon[griffon$Jobs>1 & griffon$IOscheduler=="cfq",],
               edel[edel$Jobs>1 & edel$IOscheduler=="cfq",])
   dfc2 = rbind(griffon[griffon$Jobs==1 & griffon$IOscheduler=="cfq",],
                edel[edel$Jobs==1 & edel$IOscheduler=="cfq",])
   dfc = rbind(dfc, dfc2[sample(nrow(dfc2), size=200),])

   dd <- data.frame(Hostname = NA, #tmpl$Hostname,
                    Date = NA, #tmpl$Date,
                    DirectIO = NA,
                    IOengine = NA,
                    IOscheduler = NA,
                    Error = NA,
                    Operation = NA, #tmpl$Operation,
                    Jobs = NA, # #d$nb.of.concurrent.access,
                    BufferSize = NA, #d$bs,
                    FileSize = NA, #d$size,
                    Runtime = NA,
                    Bandwidth = NA,
                    BandwidthMin = NA,
                    BandwidthMax = NA,
                    Latency = NA,
                    LatencyMin = NA,
                    LatencyMax = NA,
                    IOPS = NA,
                    Model = NA, #tmpl$Model,
                    DiskSize = NA, #tmpl$DiskSize,
                    HostModel = NA,
                    Duration = NA, #d$time,
                    Size = NA,
                    Bwi = NA,
                    Cluster = NA) #tmpl$Cluster)

   dd$Size = dd$FileSize/1024/1024
   dd$Bwi = dd$Duration/dd$Size
.. code-block:: r

   # Let's get rid of small files!
   dfc = subset(dfc, Size >= 10)
   # Let's get rid of the 64 GB edel disks
   dfc = dfc[!(grepl("^Edel", dfc$Cluster) & dfc$DiskSize=="64 GB"),]

   dfc$TotalSize = dfc$Size * dfc$Jobs
   dfc$BW = dfc$TotalSize / dfc$Duration
   dfc = dfc[dfc$BW >= 20,] # get rid of one point that is typically an outlier and does not make sense
.. code-block:: r

   dfc$method = ""
   dfc[dfc$Cluster=="Edel (SSD)" & dfc$Operation=="Read",]$method = "loess"
   dfc[dfc$Cluster=="Edel (SSD)" & dfc$Operation=="Write",]$method = "lm"
   dfc[dfc$Cluster=="Edel (SSD)" & dfc$Operation=="Write" & dfc$Jobs==1,]$method = ""
   dfc[dfc$Cluster=="Griffon (SATA II)" & dfc$Operation=="Write",]$method = "lm"
   dfc[dfc$Cluster=="Griffon (SATA II)" & dfc$Operation=="Write" & dfc$Jobs==1,]$method = ""
.. code-block:: r

   dfd = dfc[dfc$Operation=="Write" & dfc$Jobs==1 &
             (dfc$Cluster %in% c("Griffon (SATA II)", "Edel (SSD)")),]
   dfd = ddply(dfd, c("Cluster","Operation","Jobs","DiskSize"), summarize,
               mean = mean(BW), num = length(BW), sd = sd(BW))

   dfd$ci = 2*dfd$sd/sqrt(dfd$num)

   dfrange = ddply(dfc, c("Cluster","Operation","DiskSize"), summarize,
   dfrange = ddply(dfrange, c("Cluster","DiskSize"), mutate,
Griffon (SATA II)
-----------------

Modeling resource sharing w/ concurrent access
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This figure presents the overall performance of IO operations with
concurrent access to the disk. Note that the image differs
from the one in the paper; we would probably need to further clean the
available data to obtain exactly the same results.
.. code-block:: r

   ggplot(data=dfc, aes(x=Jobs, y=BW, color=Operation)) + theme_bw() +
     geom_point(alpha=.3) +
     geom_point(data=dfrange, size=0) +
     facet_wrap(Cluster~Operation, ncol=2, scale="free_y") +
     geom_smooth(data=dfc[dfc$method=="loess",], color="black", method=loess, se=TRUE, fullrange=T) +
     geom_smooth(data=dfc[dfc$method=="lm",], color="black", method=lm, se=TRUE) +
     geom_point(data=dfd, aes(x=Jobs, y=BW), color="black", shape=21, fill="white") +
     geom_errorbar(data=dfd, aes(x=Jobs, ymin=BW-ci, ymax=BW+ci), color="black", width=.6) +
     xlab("Number of concurrent operations") + ylab("Aggregated Bandwidth (MiB/s)") +
     guides(color=FALSE) + xlim(0,NA) + ylim(0,NA)
.. image:: tuto_disk/fig/griffon_deg.png
Getting the read data for Griffon, from 1 to 15 concurrent reads.

.. code-block:: r

   IO_INFO = list()
   deg_griffon = dfc %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Read")
   model = lm(BW~Jobs, data = deg_griffon)
   IO_INFO[["griffon"]][["degradation"]][["read"]] = predict(model, data.frame(Jobs=seq(1,15)))
   toJSON(IO_INFO, pretty = TRUE)

::

   "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575]
Same for the write operations.

.. code-block:: r

   deg_griffon = dfc %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Write") %>% filter(Jobs > 2)
   mean_job_1 = dfc %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Write") %>% filter(Jobs == 1) %>% summarize(mean = mean(BW))
   model = lm(BW~Jobs, data = deg_griffon)
   IO_INFO[["griffon"]][["degradation"]][["write"]] = c(mean_job_1$mean, predict(model, data.frame(Jobs=seq(2,15))))
   toJSON(IO_INFO, pretty = TRUE)

::

   "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
   "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]
Modeling read/write bandwidth variability
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Fig. 5 in the paper presents the noise in the read/write operations on
the Griffon SATA disk.

The paper uses regular histograms to illustrate the distribution of the
effective bandwidth. However, in this tutorial, we use dhist
(`https://rdrr.io/github/dlebauer/pecan-priors/man/dhist.html <https://rdrr.io/github/dlebauer/pecan-priors/man/dhist.html>`_) to obtain
more precise information about the highly dense areas around the mean.
First, we present the histogram for read operations.

.. code-block:: r

   griffon_read = df %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Read") %>% select(Bwi)
   dhist(1/griffon_read$Bwi)

.. image:: tuto_disk/fig/griffon_read_dhist.png
Saving it to be exported in JSON format.

.. code-block:: r

   griffon_read_dhist = dhist(1/griffon_read$Bwi, plot=FALSE)
   IO_INFO[["griffon"]][["noise"]][["read"]] = c(breaks=list(griffon_read_dhist$xbr), heights=list(unclass(griffon_read_dhist$heights)))
   IO_INFO[["griffon"]][["read_bw"]] = mean(1/griffon_read$Bwi)
   toJSON(IO_INFO, pretty = TRUE)

::

   In hist.default(x, breaks = cut.pt, plot = FALSE, probability = TRUE) :
     argument 'probability' is not made use of

   "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
   "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]

   "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
   "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]
Same analysis for the write operations.

.. code-block:: r

   griffon_write = df %>% filter(grepl("^Griffon", Cluster)) %>% filter(Operation == "Write") %>% select(Bwi)
   dhist(1/griffon_write$Bwi)

.. image:: tuto_disk/fig/griffon_write_dhist.png

.. code-block:: r

   griffon_write_dhist = dhist(1/griffon_write$Bwi, plot=FALSE)
   IO_INFO[["griffon"]][["noise"]][["write"]] = c(breaks=list(griffon_write_dhist$xbr), heights=list(unclass(griffon_write_dhist$heights)))
   IO_INFO[["griffon"]][["write_bw"]] = mean(1/griffon_write$Bwi)
   toJSON(IO_INFO, pretty = TRUE)

::

   In hist.default(x, breaks = cut.pt, plot = FALSE, probability = TRUE) :
     argument 'probability' is not made use of

   "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
   "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]

   "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
   "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]

   "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
   "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]

   "read_bw": [68.5425],
   "write_bw": [50.6045]
Edel (SSD)
----------

This section presents exactly the same analysis for the Edel SSDs.
Modeling resource sharing w/ concurrent access
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Getting the read data for Edel, from 1 to 15 concurrent operations.

.. code-block:: r

   deg_edel = dfc %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Read")
   model = loess(BW~Jobs, data = deg_edel)
   IO_INFO[["edel"]][["degradation"]][["read"]] = predict(model, data.frame(Jobs=seq(1,15)))
   toJSON(IO_INFO, pretty = TRUE)

::

   "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
   "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]

   "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
   "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]

   "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
   "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]

   "read_bw": [68.5425],
   "write_bw": [50.6045]

   "read": [150.5119, 167.4377, 182.2945, 195.1004, 205.8671, 214.1301, 220.411, 224.6343, 227.7141, 230.6843, 233.0923, 235.2027, 236.8369, 238.0249, 238.7515]
Same for the write operations.

.. code-block:: r

   deg_edel = dfc %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Write") %>% filter(Jobs > 2)
   mean_job_1 = dfc %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Write") %>% filter(Jobs == 1) %>% summarize(mean = mean(BW))
   model = lm(BW~Jobs, data = deg_edel)
   IO_INFO[["edel"]][["degradation"]][["write"]] = c(mean_job_1$mean, predict(model, data.frame(Jobs=seq(2,15))))
   toJSON(IO_INFO, pretty = TRUE)

::

   "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
   "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]

   "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
   "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]

   "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
   "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]

   "read_bw": [68.5425],
   "write_bw": [50.6045]

   "read": [150.5119, 167.4377, 182.2945, 195.1004, 205.8671, 214.1301, 220.411, 224.6343, 227.7141, 230.6843, 233.0923, 235.2027, 236.8369, 238.0249, 238.7515],
   "write": [132.2771, 170.174, 170.137, 170.1, 170.063, 170.026, 169.9889, 169.9519, 169.9149, 169.8779, 169.8408, 169.8038, 169.7668, 169.7298, 169.6927]
Modeling read/write bandwidth variability
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
.. code-block:: r

   edel_read = df %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Read") %>% select(Bwi)
   dhist(1/edel_read$Bwi)

.. image:: tuto_disk/fig/edel_read_dhist.png

Saving it to be exported in JSON format.

.. code-block:: r

   edel_read_dhist = dhist(1/edel_read$Bwi, plot=FALSE)
   IO_INFO[["edel"]][["noise"]][["read"]] = c(breaks=list(edel_read_dhist$xbr), heights=list(unclass(edel_read_dhist$heights)))
   IO_INFO[["edel"]][["read_bw"]] = mean(1/edel_read$Bwi)
   toJSON(IO_INFO, pretty = TRUE)

::

   In hist.default(x, breaks = cut.pt, plot = FALSE, probability = TRUE) :
     argument 'probability' is not made use of

   "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
   "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]

   "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
   "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]

   "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
   "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]

   "read_bw": [68.5425],
   "write_bw": [50.6045]

   "read": [150.5119, 167.4377, 182.2945, 195.1004, 205.8671, 214.1301, 220.411, 224.6343, 227.7141, 230.6843, 233.0923, 235.2027, 236.8369, 238.0249, 238.7515],
   "write": [132.2771, 170.174, 170.137, 170.1, 170.063, 170.026, 169.9889, 169.9519, 169.9149, 169.8779, 169.8408, 169.8038, 169.7668, 169.7298, 169.6927]

   "breaks": [104.1667, 112.3335, 120.5003, 128.6671, 136.8222, 144.8831, 149.6239, 151.2937, 154.0445, 156.3837, 162.3555, 170.3105, 178.3243],
   "heights": [0.1224, 0.1224, 0.1224, 0.2452, 1.2406, 61.6128, 331.2201, 167.6488, 212.1086, 31.3996, 2.3884, 1.747]

   "read_bw": [152.7139]
.. code-block:: r

   edel_write = df %>% filter(grepl("^Edel", Cluster)) %>% filter(Operation == "Write") %>% select(Bwi)
   dhist(1/edel_write$Bwi)

.. image:: tuto_disk/fig/edel_write_dhist.png

Saving it to be exported later.

.. code-block:: r

   edel_write_dhist = dhist(1/edel_write$Bwi, plot=FALSE)
   IO_INFO[["edel"]][["noise"]][["write"]] = c(breaks=list(edel_write_dhist$xbr), heights=list(unclass(edel_write_dhist$heights)))
   IO_INFO[["edel"]][["write_bw"]] = mean(1/edel_write$Bwi)
   toJSON(IO_INFO, pretty = TRUE)

::

   In hist.default(x, breaks = cut.pt, plot = FALSE, probability = TRUE) :
     argument 'probability' is not made use of

   "read": [66.6308, 64.9327, 63.2346, 61.5365, 59.8384, 58.1403, 56.4423, 54.7442, 53.0461, 51.348, 49.6499, 47.9518, 46.2537, 44.5556, 42.8575],
   "write": [49.4576, 26.5981, 27.7486, 28.8991, 30.0495, 31.2, 32.3505, 33.501, 34.6515, 35.8019, 36.9524, 38.1029, 39.2534, 40.4038, 41.5543]

   "breaks": [39.257, 51.3413, 60.2069, 66.8815, 71.315, 74.2973, 80.8883, 95.1944, 109.6767, 125.0231, 140.3519, 155.6807, 171.0094, 186.25],
   "heights": [15.3091, 41.4578, 73.6826, 139.5982, 235.125, 75.3357, 4.1241, 3.3834, 0, 0.0652, 0.0652, 0.0652, 0.3937]

   "breaks": [5.2604, 21.0831, 31.4773, 39.7107, 45.5157, 50.6755, 54.4726, 59.7212, 67.8983, 81.2193, 95.6333, 111.5864, 127.8409, 144.3015],
   "heights": [1.7064, 22.6168, 38.613, 70.8008, 84.4486, 128.5118, 82.3692, 39.1431, 9.2256, 5.6195, 1.379, 0.6429, 0.1549]

   "read_bw": [68.5425],
   "write_bw": [50.6045]

   "read": [150.5119, 167.4377, 182.2945, 195.1004, 205.8671, 214.1301, 220.411, 224.6343, 227.7141, 230.6843, 233.0923, 235.2027, 236.8369, 238.0249, 238.7515],
   "write": [132.2771, 170.174, 170.137, 170.1, 170.063, 170.026, 169.9889, 169.9519, 169.9149, 169.8779, 169.8408, 169.8038, 169.7668, 169.7298, 169.6927]

   "breaks": [104.1667, 112.3335, 120.5003, 128.6671, 136.8222, 144.8831, 149.6239, 151.2937, 154.0445, 156.3837, 162.3555, 170.3105, 178.3243],
   "heights": [0.1224, 0.1224, 0.1224, 0.2452, 1.2406, 61.6128, 331.2201, 167.6488, 212.1086, 31.3996, 2.3884, 1.747]

   "breaks": [70.9593, 79.9956, 89.0654, 98.085, 107.088, 115.9405, 123.5061, 127.893, 131.083, 133.6696, 135.7352, 139.5932, 147.4736],
   "heights": [0.2213, 0, 0.3326, 0.4443, 1.4685, 11.8959, 63.869, 110.286, 149.9741, 202.887, 80.8298, 9.0298]

   "read_bw": [152.7139],
   "write_bw": [131.7152]
Finally, let's save it to a file to be opened by our simulator.

.. code-block:: r

   json = toJSON(IO_INFO, pretty = TRUE)
   cat(json, file="IO_noise.json")
Injecting this data in SimGrid
==============================

To mimic this behavior in SimGrid, we use two features of the platform
description: a non-linear sharing policy and bandwidth factors. For more
details, please see the source code in ``tuto_disk.cpp``.
Modeling resource sharing w/ concurrent access
----------------------------------------------

The ``set_sharing_policy`` method allows the user to set a callback to
dynamically change the disk capacity. The callback is called each time
SimGrid shares the disk between a set of I/O operations.

The callback has access to the number of activities sharing the
resource and its current capacity. It must return the new resource
capacity.

.. code-block:: cpp

   static double disk_dynamic_sharing(double capacity, int n)
   {
     return capacity; // useless callback
   }

   auto* disk = host->create_disk("dump", 1e6, 1e6);
   disk->set_sharing_policy(sg4::Disk::Operation::READ, sg4::Disk::SharingPolicy::NONLINEAR, &disk_dynamic_sharing);
Modeling read/write bandwidth variability
-----------------------------------------

The noise in I/O operations can be obtained by applying a factor to
the I/O bandwidth of the disk. This factor is applied when we update
the remaining amount of bytes to be transferred, increasing or
decreasing the effective disk bandwidth.

The ``set_factor_cb`` method allows the user to set a callback to
dynamically change the factor to be applied to each I/O operation.
The callback has access to the size of the operation and its type (read or
write). It must return a multiplicative factor (e.g. 1.0 for doing nothing).

.. code-block:: cpp

   static double disk_variability(sg_size_t size, sg4::Io::OpType op)
   {
     return 1.0; // useless callback
   }

   auto* disk = host->create_disk("dump", 1e6, 1e6);
   disk->set_factor_cb(&disk_variability);
Running our simulation
----------------------

The binary was compiled in the provided docker container.

.. code-block:: shell

   ./tuto_disk > ./simgrid_disk.csv
Analyzing the SimGrid results
=============================

The figure below presents the results obtained by SimGrid.

The experiment performs I/O operations, varying the number of
concurrent operations from 1 to 15. We run only 20 simulations for
each case.

We can see that the graphics are quite similar to the ones obtained on
the real platform.
.. code-block:: r

   sg_df = read.csv("./simgrid_disk.csv")
   sg_df = sg_df %>% group_by(disk, op, flows) %>%
     mutate(bw=((size*flows)/elapsed)/10^6,
            method=if_else(disk=="edel" & op=="read", "loess", "lm"))
   sg_dfd = sg_df %>% filter(flows==1 & op=="write") %>% group_by(disk, op, flows) %>%
     summarize(mean = mean(bw), sd = sd(bw), se=sd/sqrt(n()))

   sg_df[sg_df$op=="write" & sg_df$flows==1,]$method = ""

   ggplot(data=sg_df, aes(x=flows, y=bw, color=op)) + theme_bw() +
     geom_point(alpha=.3) +
     geom_smooth(data=sg_df[sg_df$method=="loess",], color="black", method=loess, se=TRUE, fullrange=T) +
     geom_smooth(data=sg_df[sg_df$method=="lm",], color="black", method=lm, se=TRUE) +
     geom_errorbar(data=sg_dfd, aes(x=flows, y=mean, ymin=mean-2*se, ymax=mean+2*se), color="black", width=.6) +
     facet_wrap(disk~op, ncol=2, scale="free_y") +
     xlab("Number of concurrent operations") + ylab("Aggregated Bandwidth (MiB/s)") +
     guides(color=FALSE) + xlim(0,NA) + ylim(0,NA)
.. image:: tuto_disk/fig/simgrid_results.png
Note: the variability of the griffon read operations seems to decrease when
we have more concurrent operations. This is a particularity of the
griffon read speed profile and of the elapsed time calculation.

- Each point represents the time to perform the N I/O operations.

- Griffon read speed decreases with the number of concurrent
  operations.

With 15 read operations:

- At the beginning, every read gets the same bandwidth.

- We sample the noise in I/O operations, so some operations will be
  faster than others (e.g. factor > 1).

When the first read operation finishes:

- We recalculate the bandwidth sharing, now considering that we
  have 14 active read operations. This increases the bandwidth for
  each operation (about 44MiB/s).

- The remaining "slower" activities are sped up.

This behavior keeps happening until the end of the 15 operations: at
each step, we speed up the slowest operations a little and,
consequently, decrease the variability we see.