foreign-abi branch note 20220926
--------------------------------

I've updated it to OpenJDK 19 and it builds and clinfo runs, but
since I don't have a working OpenCL on my Renoir laptop even after a
couple of years I'm not inclined to bother and I'm focusing on
learning Vulkan compute instead, albeit slowly.

foreign-abi branch note
-----------------------

This is still work in progress, expect breakage, out of date doco,
other issues.

Various native support code is in the temporary package api.*,
eventually to be moved to a new nativez.

I've added 'run-X' targets to run the demos.  Arguments are passed
using ARGV.

 make run-clinfo
 make run-Mandelbrot
 make run-Mandelbrot ARGV="--gamma=15 --rotate=0"

foreign-abi branch TODO

- api.Frame should probably be a SegmentAllocator
- where pointers are passed in to native calls, MemoryAddress is used
  for the mid-level generated api - these should probably be
  Addressable to avoid the clutter of .address() invocations in the
  callee.
- some apis take or use bytebuffer, maybe should just be
  MemorySegment

INTRODUCTION
------------

This is a direct Java binding for OpenCL 2.1 and Java 13.

I originally wrote it over a few days mostly just to pass the time
and the goals were:

* Minimal C and Java code;
* Lightweight Java objects;
* Feature complete for every OpenCL api;
* Follow the OpenCL api closely;
* Use primitive Java types instead of or as well as ByteBuffers where
  it makes sense;
* Object-oriented and type-safe;
* Robust and fault-tolerant;
* Erring on the side of usability when constraints conflict;
* Simple to build.

The approach taken was to put most of the code into C so that
32/64-bit portability is provided solely by the non-java code.  This
simplifies the Java and doesn't add much complication to the C code.
To this end a 64-bit type is used on all platforms to store or
transfer values whose size may change across platforms (void * and
size_t).
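
As a rough picture of the 64-bit convention described above, the Java
side of a binding can keep a native pointer in a long regardless of
the platform's pointer width.  This is only a sketch: NativeHandle,
DemoQueue and HandleDemo are hypothetical names chosen for
illustration and are not part of zcl.

```java
/** Hypothetical base class: stores a native pointer (void *) as a 64-bit long on every platform. */
abstract class NativeHandle {
    private final long p;

    protected NativeHandle(long p) {
        if (p == 0)
            throw new NullPointerException("NULL native pointer");
        this.p = p;
    }

    /** The raw pointer value, for passing back to native code. */
    final long address() {
        return p;
    }

    @Override
    public String toString() {
        return getClass().getSimpleName() + "@0x" + Long.toHexString(p);
    }
}

/** Hypothetical concrete type: adds type safety but no extra state. */
final class DemoQueue extends NativeHandle {
    DemoQueue(long p) {
        super(p);
    }
}

class HandleDemo {
    public static void main(String[] args) {
        // A value that would not fit in a 32-bit int still round-trips unchanged.
        DemoQueue q = new DemoQueue(0x7f00deadbeefL);
        System.out.println(q);
    }
}
```

Because the subclass carries no fields of its own, the object costs
little more than the long it wraps, which is in line with the
"lightweight Java objects" goal.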

The Java instances of each OpenCL object are effectively just empty
interfaces - they hold nothing but the pointer to the underlying
object and pass all function calls to the jni layer.  All API entry
points are defined including event callbacks and the ability to write
native kernels in Java.

As with any 'simple' project, it's grown to be somewhat complex but
it is still basic and portable code which is easy to build and
maintain.

There have been a couple of important additions to the feature list:

* Object references are now globally unique;
* Automatic and manual resource reclamation via garbage collection;
* Lambda interfaces (experimental);
* Runtime API version checking.

These all go hand in hand and allow for simplified application
development with a few caveats.  Garbage detection is via reference
queues for efficiency.

PLATFORMS
---------

As the binding is mostly native interfaces the code is platform
specific.

The only supported target is 64-bit GNU systems, which is the default
when running make.  It should also compile on 32-bit platforms in a
straightforward manner.

It can be cross-compiled for windows-amd64, but that target is NOT
SUPPORTED.

The library supports full OpenCL 2.1 and two OpenGL related
extensions.

COMPILING
---------

Standard GNU development tools and a JDK are required in order to
compile zcl (ant is not used).  perl and cproto are also required.
Java 13 is the baseline used but JDK 9+ may work.

The prerequisite project notzed.nativez must also have been compiled
previously; it will be used automatically if it is present in the
directory above this one (../nativez/).

Copy config.make.in to config.make and edit for configuration
parameters.

The jni source includes the KHR OpenCL headers so an OpenCL SDK is
not required, but one may be used instead by defining CL_HOME on the
command line.

Build everything for default target of linux-amd64:

 $ make

Build everything for a specific target:

 $ make TARGET=

All intermediate and final results are placed in `bin/'.

 `bin//lib/.jar'       Modular jar for ide.
 `bin//lib/*.so'       Platform specific
 `bin//bin/*.dll'      runtime libraries.
 `bin//jmods/.jmod'    Target specific .jmod.

 $ make bin

Build everything but the jars and jmods.

 $ make dist

Create a source archive.  The source is found via $(find) so this
will include any droppings.

OPENCL VERSION
--------------

For the most part it is up to applications to determine the OpenCL
version in use and call functions appropriately.

As of version 0.6 runtime version checking has been implemented for
most api calls for version 2.0 or later.  The version check is
relatively cheap but has not been implemented on all versioned
interfaces for performance reasons (i.e. CLKernel).

The binding should be compilable against OpenCL headers from 1.0 or
later, although as the headers are included this is not required (or
tested).

Linking
- - - -

As of 0.5 the library only uses dynamic linking.  A function table is
loaded at init time and all methods are invoked via the table.

A script is used to generate the function table from CL/cl.h.  It
assumes a very specific formatting.

NULL function pointers are protected against, although a non-NULL
pointer is not a reliable indicator of platform support.

Currently the OpenCL version of the SDK the library is compiled
against sets the highest supported API level.  By default the bundled
headers are used, which are at OpenCL 2.1.

USING
-----

If not using .jmod, the directory containing libzcl.so must be added
to the java.library.path or LD_LIBRARY_PATH during execution.  The
same goes for libnativez.so.

The API follows the C api closely together with some convenient
property getters and some overloaded functions for native arrays.

Setting the environment variable ZCL_SYNC to non-zero will print all
kernel executions and force a clFlush() and clFinish() at each.
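
When the libraries fail to load, a quick way to check the library
path requirement above is to look for the files directly.  The
following is a minimal diagnostic sketch, not part of zcl:
LibraryPathCheck and locate() are hypothetical names, and the only
facts taken from this document are the library names zcl and nativez
(libzcl.so, libnativez.so).  It only inspects java.library.path and
does not model how the system dynamic linker resolves dependencies.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class LibraryPathCheck {
    /** Returns the full path of every copy of the named library found on searchPath. */
    static List<Path> locate(String searchPath, String libName) {
        // Platform file name, e.g. "zcl" becomes "libzcl.so" on Linux.
        String fileName = System.mapLibraryName(libName);
        List<Path> found = new ArrayList<>();
        for (String dir : searchPath.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty())
                continue;
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate))
                found.add(candidate);
        }
        return found;
    }

    public static void main(String[] args) {
        String path = System.getProperty("java.library.path", "");
        // An empty list means the JVM will not find that library by name.
        for (String lib : new String[] { "zcl", "nativez" })
            System.out.println(lib + ": " + locate(path, lib));
    }
}
```

Run it with the same -Djava.library.path or LD_LIBRARY_PATH settings
as the application being debugged.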

clinfo
------

A simple and incomplete implementation of the 'clinfo' tool is part
of the package.  After building it can be used as a basic
functionality test.

 $ LD_LIBRARY_PATH=../nativez/bin/notzed.nativez/linux-amd64/lib:bin/notzed.zcl/linux-amd64/lib \
   java --module-path bin/modules:../nativez/bin/modules \
        -m notzed.zcl.demo/au.notzed.zcl.tools.clinfo

Note that clinfo will crash on coprthr 1.6.0 due to a bug.

STATUS
------

As of 0.5 this should be considered beta quality.  It is in use for
prototype development in a research context.

It implements every base OpenCL 2.1 api, even some that aren't very
useful for Java.  All deprecated functions and constants from OpenCL
2.1 are (should be) correctly marked as such.

Most interfaces are protected from null pointers, bad ranges, and
runtime failures.  This will protect against many common api errors
with Java exceptions but it is still (easily) possible to crash
applications with no obvious cause.

Two KHR extensions have been (mostly) implemented: GLSharing and
GLEvent.  They reside in the au.notzed.zcl.khr namespace so have
dropped some mess from the names.

Beyond property getters there are few 'helper' functions, so methods
take all the parameters of their C counterparts.  This is not always
terribly convenient.

Documentation is limited.

None of the OpenCL 2.1 functionality has been tested and it may be
broken.

Automatic Resource Reclamation
- - - - - - - - - - - - - - - -

All objects should be automatically garbage collected.  This is
mostly automagic but in some cases requires the user to maintain
references to parent objects explicitly.

Objects may also be manually disposed of using release().  It is not
recommended that retain() be used from Java at all.

This automatic resource reclamation comes with the usual caveats for
native resources - the impedance mismatch between Java and native can
cause allocation problems.  Somebody Else's Problem (it seems to work
quite well in practice).
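
The general technique of combining manual release with reclamation
through a reference queue can be sketched as follows.  This is an
illustration under stated assumptions, not zcl's implementation:
Resource, Reclaimer, track() and drain() are hypothetical names, and
a counter stands in for the native release call.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a binding object; it holds only a native handle. */
class Resource {
    final long handle;
    Resource(long handle) { this.handle = handle; }
}

/** Releases each tracked handle exactly once, either manually or after collection. */
class Reclaimer {
    /** Weak reference that remembers the handle after the referent is gone. */
    private static final class Ref extends WeakReference<Resource> {
        final long handle;
        Ref(Resource r, ReferenceQueue<Resource> q) { super(r, q); handle = r.handle; }
    }

    private final ReferenceQueue<Resource> queue = new ReferenceQueue<>();
    private final Map<Long, Ref> live = new ConcurrentHashMap<>();
    /** Counts releases; a real binding would call into native code instead. */
    final AtomicInteger released = new AtomicInteger();

    Resource track(long handle) {
        Resource r = new Resource(handle);
        live.put(handle, new Ref(r, queue));
        return r;
    }

    /** Manual disposal; calling it again for the same object does nothing. */
    void release(Resource r) {
        Ref ref = live.remove(r.handle);
        if (ref != null) {
            ref.clear(); // a reference cleared by the program is not enqueued
            released.incrementAndGet();
        }
    }

    /** Releases handles whose objects were collected; call periodically or from a daemon thread. */
    int drain() {
        int n = 0;
        for (Ref ref; (ref = (Ref) queue.poll()) != null; ) {
            // Remove only if the map still holds this exact reference.
            if (live.remove(ref.handle, ref)) {
                released.incrementAndGet();
                n++;
            }
        }
        return n;
    }
}

class ReclaimDemo {
    public static void main(String[] args) {
        Reclaimer rc = new Reclaimer();
        Resource a = rc.track(0x1000L);
        rc.release(a);
        rc.release(a); // second call is ignored
        System.out.println("released = " + rc.released.get());
    }
}
```

When the collector actually enqueues a reference is up to the JVM,
which is one source of the allocation caveat mentioned above: native
memory can run short long before the Java heap is under enough
pressure to trigger a collection.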

Primitive array buffers
- - - - - - - - - - - -

There is support for passing primitive arrays to functions which
would otherwise only take a direct ByteBuffer.  This (currently)
forces any enqueue operations to blocking mode and also bloats out
the api and implementation - so the limited convenience they provide
might not be worth their cost.

Note that this only affects 'asynchronous memory' and does not affect
small arrays used for things like dimensions, ranges, or patterns
which are copied by the jni and OpenCL API at method invocation time.

Lambdas
- - - -

A design-in-progress lambda based mechanism for encapsulating
enqueuable tasks is defined by CLTask.  Some operations have been
added to the target objects such as ofRead(byte[]) to the CLBuffer
interface.

The primary goal of this interface is to provide a cleaner api by
removing the need to pass a pair of CLEventLists and a CLCommandQueue
to every work function.  As a side-effect of garbage collection and
features of lambdas it provides extra convenience as well.

For example, rather than calling this every time:

 q.enqueueReadBuffer(src, false, 0, size, dst, 0, null, null);
 q.enqueueReadBuffer(src, false, 0, size, dst, 0, waits, events);

One can instead do this:

 // this once
 CLTask task = src.ofRead(dst, size);

 // multiple times
 q.offer(task);
 q.offer(task, waits, events);

And so on, with a few other little lambda treats along the way like
'andThen()'.

Although this mechanism allows for flexible "pre-compiled"
expressions, the inability to change parameters at invocation time
has proved to be somewhat limiting.

This is still work in progress.

Extensions
----------

OpenCL extensions and function entry points must be resolved via the
CLPlatform.

Extension interfaces are handled by a two-layer mechanism.  Firstly a
Java/JNI combination implements one specific extension and contains
all constants and entry points for the extension.  The java object
must derive from CLExtension.

These may be used as is but are inconvenient.  So relevant entry
points are then added to each target object (i.e. first parameter of
function) which handles the extension lookup.  Each object which may
have extended functions on it must derive from CLExtendable, which
provides an efficient way to cache and resolve the extension object.

Core extension objects are instantiated via CLPlatform but may be
created by independent JNI code and used without modifying the core
api.  They are currently never unloaded.

There may be some changes but the current design seems sufficient.

FUTURE PLANS
------------

I'm not sure now, I haven't been using it much and there's zero
outside interest.  CUDA is all the fucked-up-rage these days.

I might look into Vulkan, which has significant overlap although not
such a simple API.

LICENSE
-------

Most of the code is licensed under GNU General Public License
version 3, see COPYING for full details.

Copyright (C) 2014,2015 Michael Zucchi

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or (at
your option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program.  If not, see .

Certain content was copied from cl.h from OpenCL 1.2; these files
include the following header:

Copyright (c) 2008 - 2012 The Khronos Group Inc.

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and/or associated documentation files (the
"Materials"), to deal in the Materials without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Materials, and to
permit persons to whom the Materials are furnished to do so, subject
to the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Materials.

THE MATERIALS ARE PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT.  IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
CONNECTION WITH THE MATERIALS OR THE USE OR OTHER DEALINGS IN THE
MATERIALS.