Virtualization within OpenMOLE
Virtualization has been implemented within OpenMOLE.
For instance, in this MOLE, the task "hello" launches a VM and executes commands inside it:
import org.openmole.core.implementation.capsule.*
import org.openmole.plugin.task.groovy.*
import org.openmole.core.implementation.mole.*
import org.openmole.plugin.task.systemexec.*
import org.openmole.plugin.resource.virtual.*
import org.openmole.core.implementation.data.*
import org.openmole.core.implementation.transition.*
fileForVM = new Prototype("fileForVM", File)
file = new Prototype("file", File)
generateFileForVM = new GroovyTask("generateFileFor")
generateFileForVM.setCode("fileForVM = workspace.newTmpFile();\nfileForVM.write('Hello from groovy');\n")
generateFileForVM.addOutput(fileForVM)
virtualMachine = new VirtualMachineResource("/home/reuillon/Documents/Tmp/qemu/lucid_mini_comp.img","root","toor")
hello = new VirtualSystemExecTask("hello",virtualMachine,"hostname ; pwd ; echo `cat fileForVM` and also hello from vm >file")
hello.exportFileFromContextAs(fileForVM, "fileForVM")
hello.importFileInContext(file, "file")
disp = new GroovyTask("disp")
disp.setCode("println 'read from groovy: ' + file.text")
disp.addInput(file)
generateFileForVMC = new TaskCapsule(generateFileForVM)
helloC = new TaskCapsule(hello)
dispC = new TaskCapsule(disp)
new SingleTransition(generateFileForVMC, helloC)
new SingleTransition(helloC, dispC)
new Mole(generateFileForVMC).run()
The result of this MOLE is:
ubuntu
/root/fd394922-1e8d-441b-889b-8e052b8e43c0
read from groovy: Hello from groovy and also hello from vm
More testing has to be done.
For now, OpenMOLE virtualization supports only GNU/Linux hosts. We need to compile qemu for Windows, MacOS, *BSD, Solaris and so on.
The next step is to integrate legacy scientific software such as Scilab, Octave and R.
Scala(ble) plugin in OpenMOLE
Scala interoperates smoothly with Java and has the advantages of a cutting-edge language: both functional and object-oriented programming, scalability, type inference, concise syntax... We have therefore made it possible to extend OpenMOLE with plugins written in Scala or in a mix of Scala and Java.
The first example is the DataSetDistributionTask plugin. I am pretty new to Scala (and to functional programming), so for now the source code structure is close to that of a Java version, but Scala's possibilities will be explored as my Scala knowledge grows.
The current Scala version (2.7) provides limited interoperability with Java collections. This will be tackled in version 2.8.
Mole example using a model in a plugin
In this example we want to distribute replications of a genetic algorithm that solves the travelling salesman problem (TSP). The sources of this library are available here for browsing and here for checkout. The way to turn a Maven project into an OSGi bundle, and consequently into an OpenMOLE plugin, is described here.
First, import the needed packages and load the platform plugins.
plugin.loadDir('openmole-plugins')
import org.openmole.core.workflow.implementation.data.*
import org.openmole.core.workflow.implementation.transition.*
import org.openmole.core.workflow.implementation.mole.Mole
import org.openmole.core.workflow.implementation.task.*
import org.openmole.core.workflow.implementation.capsule.*
import org.openmole.core.workflow.implementation.plan.*
import org.openmole.core.workflow.implementation.domain.*
import org.openmole.core.workflow.implementation.resource.*
import org.openmole.core.workflow.implementation.mole.execution.*
import org.openmole.plugin.plan.completeplan.*
import org.openmole.plugin.task.groovytask.*
import org.openmole.plugin.domain.interval.*
import org.openmole.plugin.environmentprovider.glite.*
import org.openmole.plugin.task.storeintocsvtask.*
Then define paths for future use.
tspPlugin = '/iscpif/users/reuillon/NetBeansProjects/tsp/target/tsp-1.0-SNAPSHOT.jar'
tspDir = "/iscpif/users/reuillon/work/TSP"
tspFilePath = new File(tspDir, "att48.tsp")
Load the user-crafted plugin containing the classes for solving the TSP.
plugin.load(tspPlugin)
import org.openmole.tools.distrng.prng.*
import org.openmole.tools.distrng.prng.parallelization.*
Configure a grid environment.
baseCheckoutDir = '/iscpif/users/reuillon/tmp/openmole/'
runtime = baseCheckoutDir + "runtime/org.openmole.runtime/target/org.openmole.runtime-0.3.tar.bz2"
desc = new GliteEnvironmentDescription("vo.iscpif.fr", "voms://grid12.lal.in2p3.fr:20013/O=GRID-FR/C=FR/O=CNRS/OU=LAL/CN=grid12.lal.in2p3.fr", "ldap://topbdii.grif.fr:2170")
env = desc.getMatching()
env.setRuntime(runtime)
Declare three variables: tspFile contains a description of the TSP instance to solve, distance is a double holding the length of the best solution found by the algorithm, and rng contains a pseudo-random number generator (PRNG).
tspFile = new Prototype("tspFile", File)
distance = new Prototype("distance", Double)
rng = new Prototype("rng", IPRNG)
Configure the parallelization of the random number generator. This facility is provided by the DistRNG library, which is part of the OpenMOLETools libraries. The TSP model has been built on top of this library.
secureRandomRNG = new SecureRandomRNG()
indexSequence = new IndexSequence(secureRandomRNG, WELL1024)
Define a complete plan for exploring 1000 independent states of the WELL1024 pseudo-random number generator.
plan = new CompletePlan()
plan.addFactor(new Factor(rng, new SampledDomain(new IteratorDomain(indexSequence), 1000)))
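As an illustration of the idea of giving each replication its own generator state, here is a minimal sketch in plain Java. It does not use DistRNG, and how IndexSequence actually derives WELL1024 states is not shown in this post. The sketch assumes the simplest scheme, random seeding from a secure source; the class name, the method name and the use of java.util.Random as a stand-in for WELL1024 are all hypothetical.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class IndependentStreamsSketch {

    // Draw one seed per replication from a cryptographically strong source
    // (assumption: random seeding, not necessarily what DistRNG does).
    static List<Long> drawSeeds(int count) {
        SecureRandom source = new SecureRandom();
        List<Long> seeds = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            seeds.add(source.nextLong());
        }
        return seeds;
    }

    public static void main(String[] args) {
        List<Long> seeds = drawSeeds(1000);
        // java.util.Random is only a stand-in: WELL1024 is not part of the JDK.
        Random first = new Random(seeds.get(0));
        System.out.println(seeds.size() + " generators; first draw: " + first.nextDouble());
    }
}
```

Each seed would then be handed to one replication, so that no two jobs share a generator.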
Build the exploration task.
explorationTask = new ExplorationTask("exploration", plan)
Define the task that launches the TSP solver. This task has one parameter (the file describing the TSP problem), one input (the PRNG) and one output (the length of the shortest path found by the genetic algorithm). It uses one resource: the plugin containing the TSP solving classes.
// Second task consumes the variable
tspTask = new GroovyTask("TSP task")
tspTask.addImport('fr.iscpif.tsp.*')
tspTask.addImport('org.openmole.tools.distrng.prng.*')
tspTask.setCode("tsp = new Tsp(tspFile); distance = tsp.computeShorterPath(1000000,2000000000,rng).getDistance()")
tspTask.addResource(new PluginResource(tspPlugin))
tspTask.addParameter(tspFile, tspFilePath)
tspTask.addInput(rng)
tspTask.addOutput(distance)
Define a task for storing the results in a CSV file. The result is an array of doubles because this task comes right after the aggregation transition.
//store the results
storeTask = new StoreIntoCSVTask("storeTask", tspDir + "/distances" + System.currentTimeMillis() +".csv")
storeTask.addColumn(distance.array())
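To see why the storing task receives an array rather than a single value, here is a minimal sketch in plain Java of the fan-out and gather pattern. It does not use the OpenMOLE API: the class name, the method name and the toy model are hypothetical.

```java
import java.util.function.IntToDoubleFunction;
import java.util.stream.IntStream;

public class AggregationSketch {

    // Run the model once per explored value (fan-out), then gather
    // every output into a single array (aggregation).
    static double[] exploreAndAggregate(int replications, IntToDoubleFunction model) {
        return IntStream.range(0, replications).mapToDouble(model).toArray();
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the TSP solver: returns a fake distance.
        double[] distances = exploreAndAggregate(5, i -> 100.0 + i);
        System.out.println(distances.length + " distances collected");
    }
}
```

The task placed after the gathering step sees one value per replication, which is why the CSV column above is declared with distance.array().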
Define the capsules.
explorationTaskCaps = new ExplorationTaskCapsule(explorationTask)
tspTaskCaps = new TaskCapsule(tspTask)
storeTaskCaps = new TaskCapsule(storeTask)
Define the transitions.
new ExplorationTransition(explorationTaskCaps, tspTaskCaps)
new AggregationTransition(tspTaskCaps, storeTaskCaps)
strat = new FixedEnvironmentStrategy()
Set the computational grid as the execution environment of the task capsule.
strat.setEnvironment(tspTaskCaps,env)
Build the Mole and execute it.
ex = new Mole(explorationTaskCaps).createExecution(strat)
ex.start()
Data intensive simulation on EGEE with SimExplorer
Distribution of a data intensive simulation on EGEE (biomed) with the development version of SimExplorer (Workflow version).
data size : 33 MB of data / job
total : 50 GB of results
Note: SimExplorer provides transparent data compression during file transfers, so it did not actually exchange 50 GB of raw data with the grid.
A simulation of self-propelled [...] has been encapsulated into SimExplorer (workflow version) from a version running on an SGE (Sun Grid Engine) cluster. The integration effort took less than an hour.
1562 jobs have been executed following this design:
design.addExperimentalFactor(new ExperimentalFactor<BigDecimal>(eta,new RangeBigDecimal("0.0","0.3","0.01")));
design.addExperimentalFactor(new ExperimentalFactor<BigDecimal>(angle,new RangeBigDecimal("0.0","5.0","0.1")));
The launch command is as follows:
"./follow_me_2d_gaussianNnr 1045674222 1000 4 128 ${eta} ${angle} 50 0.01 1000000 fm_${eta}_${angle}.dat"
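To illustrate how the two factors and the command template combine, here is a minimal sketch in plain Java that enumerates the design and substitutes each pair of values into the command. It does not use the SimExplorer API: the class and method names are hypothetical, and it assumes that both bounds of each range are inclusive. Under that assumption the enumeration yields 31 x 51 = 1581 combinations, which differs from the 1562 jobs reported, so the actual range semantics or the set of jobs counted may be different.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

public class DesignSketch {

    // Enumerate values from min to max with the given step
    // (assumption: both bounds are inclusive).
    static List<BigDecimal> range(String min, String max, String step) {
        List<BigDecimal> values = new ArrayList<>();
        BigDecimal end = new BigDecimal(max);
        BigDecimal inc = new BigDecimal(step);
        for (BigDecimal v = new BigDecimal(min); v.compareTo(end) <= 0; v = v.add(inc)) {
            values.add(v);
        }
        return values;
    }

    // Build one command line per (eta, angle) pair of the full factorial design.
    static List<String> commands() {
        List<String> result = new ArrayList<>();
        for (BigDecimal eta : range("0.0", "0.3", "0.01")) {
            for (BigDecimal angle : range("0.0", "5.0", "0.1")) {
                result.add("./follow_me_2d_gaussianNnr 1045674222 1000 4 128 "
                        + eta + " " + angle
                        + " 50 0.01 1000000 fm_" + eta + "_" + angle + ".dat");
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> cmds = commands();
        System.out.println(cmds.size() + " commands");
        System.out.println(cmds.get(0));
    }
}
```

BigDecimal is used instead of double so that repeated additions of 0.01 or 0.1 do not drift and the upper bound is reached exactly.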
Each job runs in 45 minutes. Executing the 1562 experiments would have required 48.8 days on a single computer (48.8 CPU-days). It has been executed in 9.8 hours on the biomed grid. The crunching factor (speed-up) is 118.7. The bottleneck was the slow Internet connection from which the experiment was launched (1 MB/s upstream transfer speed).
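The figures above can be rechecked with a short calculation, sketched here in plain Java with hypothetical class and method names. With the rounded values given in the text, the sequential time is 1171.5 hours (about 48.8 days) and the ratio to 9.8 hours is about 119.5. The reported crunching factor is slightly lower, which would be consistent with a wall-clock time a little above 9.8 hours; that explanation is an assumption, not something stated in the post.

```java
public class SpeedupSketch {

    // Total sequential time in hours if all jobs ran one after another.
    static double sequentialHours(int jobs, double minutesPerJob) {
        return jobs * minutesPerJob / 60.0;
    }

    // Ratio between sequential time and observed wall-clock time.
    static double speedup(double sequentialHours, double wallClockHours) {
        return sequentialHours / wallClockHours;
    }

    public static void main(String[] args) {
        double seq = sequentialHours(1562, 45.0);
        System.out.println("sequential hours: " + seq);
        System.out.println("sequential days:  " + seq / 24.0);
        System.out.println("speed-up at 9.8 h: " + speedup(seq, 9.8));
    }
}
```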





