Data360 Analyze

  • 1.  Importing pure python packages into Analyze's transform nodes

    Posted 03-02-2020 04:49

    I understand that many Python modules cannot be used within D360 Analyze's nodes because of the former's dependencies on C/C++ libraries.

    But can a pure Python library be installed somewhere within D360's Jython directory structure (e.g. in platform\windows-x86-64\jython\Lib\site-packages)? Were this possible, I could use these packages within a Transform node.

    I couldn't find a Jython pip to do the install, and running jython.exe in the Windows command shell fails with an error. This is what I attempted:

    jython -m pip install <module>

    Traceback (most recent call last):
    File "jython.py", line 535, in <module>
    File "jython.py", line 526, in main
    File "subprocess.py", line 168, in call
    File "subprocess.py", line 390, in __init__
    File "subprocess.py", line 640, in _execute_child
    WindowsError: [Error 2] The system cannot find the file specified
    Failed to execute script jython



  • 2.  RE: Importing pure python packages into Analyze's transform nodes

    Employee
    Posted 03-03-2020 03:33

    Re. the cmd shell error: The Jython instance that is bundled with Data360 Analyze is not intended or configured to provide shell access. The error above is not specific to the pip module; the same error would be generated by other commands, such as jython --version.

    Before commenting on your other questions, I need to state the following caveats:

    • Only modules that are part of the Jython standard library or are shipped as part of Data360 Analyze are covered by the Infogix support policy.
    • When an Analyze instance is upgraded, any packages installed into the Jython site-packages directory will be lost. For this reason it is recommended that any custom packages are installed in a directory outside of the Analyze installation directory. The directory used for the local repository would not be on the Python library path so it is necessary to add the directory to the path for all nodes that utilize locally installed packages.
    • When a package is installed by pip the normal operation is to also attempt to install any dependent packages. The primary package you want to install could be a pure Python package but the package may depend on other packages that are written in C/C++ which can lead to an installation failure. Installing non-pure Python packages on Linux typically requires the relevant compiler to be installed in the OS.
    • Not all packages have been written to the required standards, and some packages may not operate correctly when they are installed in a custom directory rather than the site-packages directory. Indeed, some packages may still fail to operate even when installed into the default site-packages directory.

     

    The version of Jython bundled with Analyze includes pip. This can be accessed from within the scripts of a Transform node. The following provides an example of how the pure-Python PyPDF2 package can be installed by leveraging pip within a Transform node:

     

     

    The pip module is not intended to be imported programmatically from within Jython. In the example ConfigureFields script above, pip is called using the subprocess module. 

    After importing the required libraries the script defines the target directory where the package is to be installed. The script then constructs the target parameter for the pip install command.

    A function is defined to call pip within a subprocess.

    After defining the name of the package to be installed, the function is called to perform the installation. The standard output from the installation command is captured so it can be output by the node as confirmation of what was installed.

    The ProcessRecords script is then used to output the results.
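
    The steps described above can be sketched roughly as follows. This is a minimal illustration, not the attached script: the function names are made up for this example, and "launcher" (the argument list that starts the interpreter) is an assumption that depends on how your Analyze instance exposes Jython. The --target option is the standard pip option for installing into a specific directory.

```python
import subprocess


def build_pip_install_command(launcher, package, target_dir):
    """Return the argument list for 'pip install --target <dir> <package>'.

    launcher is an assumption: the list of arguments that starts a Python
    or Jython interpreter in your environment.
    """
    return list(launcher) + ["-m", "pip", "install",
                             "--target", target_dir, package]


def run_and_capture(command):
    """Run a command in a subprocess; return (exit code, combined output)."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    output, _ = proc.communicate()
    # Decode so the text can be written to an output field by the node.
    return proc.returncode, output.decode("utf-8", "replace")


# Example call (not executed here); launcher and target_dir are hypothetical:
# command = build_pip_install_command(launcher, "PyPDF2", target_dir)
# exit_code, pip_output = run_and_capture(command)
```

    Capturing stderr together with stdout means that a failed installation (for example, a dependency that needs a C compiler) shows up in the same text that the node outputs.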

     

    As previously mentioned, any node that needs to leverage a package installed into the local repository must include the path in the Jython library search path. Below is an example of how the repository's path can be appended to the search path:
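
    A minimal sketch of that step, assuming the local repository lives at a hypothetical path outside the Analyze installation directory (replace it with your own location):

```python
import sys


def add_to_search_path(directory):
    """Append directory to the module search path if it is not already there."""
    if directory not in sys.path:
        sys.path.append(directory)


# Hypothetical location of the local package repository.
local_repo = r"C:\analyze_packages\jython"
add_to_search_path(local_repo)

# Packages installed in local_repo can be imported after this point, e.g.:
# import PyPDF2
```

    The membership check avoids adding the same directory repeatedly if the script is evaluated more than once.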

     

     

     

    Attached files

    Jython Transform Node Script to Install a Pure-Python Package in a Custom Repository.txt
    Jython Transform Node Script to Import Package from Custom Repository.txt

     



  • 3.  RE: Importing pure python packages into Analyze's transform nodes

    Employee
    Posted 03-03-2020 08:10

    Note: in the above example the re package was imported. The re package is not required for the installation and can be omitted.



  • 4.  RE: Importing pure python packages into Analyze's transform nodes

    Posted 03-03-2020 08:30

    Before I saw the above, what I did was:

    1. use the pip from my Python installation to download and install my desired pure Python package.

    2. it had no dependencies, so no worries there

    3. copied the package folder from the Python installation to the D360 Jython's site-packages folder

    4. set the path as you did above in your example Transform node, and then import the new package

    Clumsy, but it worked.

    Thanks for your response - very useful.



  • 5.  RE: Importing pure python packages into Analyze's transform nodes

    Employee
    Posted 03-03-2020 09:23

    I'm glad you found a solution that worked for you.

    I would still recommend using a directory outside of the Analyze install directory if possible as it would overcome the (almost inevitable) issue where the nodes that leverage the custom packages stop working after upgrading the Analyze software. 

    You may want to consider creating a target Jython site-packages directory within your Analyze instance's 'site' directory as this would automatically be retained when the software is upgraded (and the installed packages would be included in the system backups).

    The location of the site directory can be determined using the property substitution r"{{%ls.appDataDir%}}" in your code.
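
    As a rough sketch, a package directory under the site directory could be prepared like this. The subdirectory name "jython-packages" and the helper name are assumptions for illustration only; the site directory value is expected to come from the property substitution mentioned above.

```python
import os


def ensure_package_dir(site_dir, subdir="jython-packages"):
    """Return site_dir/subdir, creating the directory if it does not exist.

    The default subdir name is hypothetical; choose any name you like.
    """
    path = os.path.join(site_dir, subdir)
    if not os.path.isdir(path):
        os.makedirs(path)
    return path


# Inside a node script the site directory would be supplied by Analyze, e.g.:
# site_dir = r"{{%ls.appDataDir%}}"
# package_dir = ensure_package_dir(site_dir)
```

    The returned path can then be used both as the pip install target and as the directory added to the search path.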

     



  • 6.  RE: Importing pure python packages into Analyze's transform nodes

    Employee
    Posted 10-16-2020 03:23

    Update:

    As of Data360 Analyze v.3.6.4, the Analyze installer creates a folder ('<site>/lib/jython2') within the installation's 'site' directory that can be used to store third-party Python packages for use with Jython-based nodes, e.g. the Transform node and the Generate Data node.

    The new jython2 directory is automatically added to the Jython package search path so any packages installed in this directory will be available to the node.

     



  • 7.  RE: Importing pure python packages into Analyze's transform nodes

    Employee
    Posted 10-16-2020 04:16

    You can construct the path to the jython2 directory within a Transform node / Generate Data node using the following script snippet:

    siteDir = node.properties.getString("site", "ls.appDataDir")
    jython2Dir = siteDir + "/lib/jython2"