Author Archives: Tianyu "biubiuty" Liu

Unreal Engine 4 memo: How to create god ray / light shaft effect for trees using the volumetric fog?

To create the god ray / light shaft effect through trees, we should use the volumetric fog feature recently added to Unreal Engine. Google search will give us the following 3 references:

The thing is, even after perusing the first 2 references you may still have a hard time implementing the desired effect. Here is a summary of the steps you need to follow to avoid hours of trial and error:

  • Add the tree, directional light, exponential height fog to the map.
  • For the exponential height fog, check the option volumetric fog; to make the god rays more obvious, increase fog density; optionally, increase fog height falloff so that the fog is concentrated at lower altitude.
  • For the directional light, to make the god rays more obvious, increase the intensity, for example to 20 lux; to create light shafts between leaves, check light shaft bloom.
  • Critical: Ensure that the mobility property of both the tree and directional light is set to movable. Without this step, the tree would have wrong, static shadow, and the directional light would not be able to create the god rays in the fog.

The above steps end up applying volumetric fog effect on a global scale. Optionally we may follow the great presentation (starting at 14:24) to implement the local volumetric fog effect via the particle system.

Unreal Engine 4 memo: How to perform rotator interpolation correctly?

In order to let an actor transition from one rotator to another smoothly, function FMath::RInterpConstantTo() should be used. The official documentation on this function is too brief to be helpful. This article aims to clear up the confusion.

Add member function and variable declaration

First off, in the actor class declaration, add the following functions and variables. PerformRotationInterpWithDelay(Delay) is the public interface to be called outside the class. Its purpose is to halt for a duration of Delay, then let the actor rotate smoothly until the target rotator is reached, which in our example is simply FRotator::ZeroRotator.

UCLASS()
class QL_API AMyActor : public AActor
{
    GENERATED_BODY()

public:

    ...

    //------------------------------------------------------------
    // After Delay seconds, perform PerformRotationInterpCallback()
    // which sets bStartRotationInterp to true, and performs rotation interpolation
    // in Tick() until the rotation becomes FRotator::ZeroRotator
    //------------------------------------------------------------
    UFUNCTION(BlueprintCallable, Category = "C++Function")
    void PerformRotationInterpWithDelay(const float Delay);

protected:

    ...

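    //------------------------------------------------------------
    // Timer callback: sets bStartRotationInterp to true
    //------------------------------------------------------------
    UFUNCTION()
    void PerformRotationInterpCallback();

    //------------------------------------------------------------
    // True while Tick() should keep interpolating the rotation
    //------------------------------------------------------------
    bool bStartRotationInterp = false;

    //------------------------------------------------------------
    // Handle of the timer started by PerformRotationInterpWithDelay()
    //------------------------------------------------------------
    FTimerHandle StartRotationDelayTimerHandle;
};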

Define the functions

The implementation of PerformRotationInterpWithDelay(Delay) is shown below. After Delay seconds, the function PerformRotationInterpCallback() is called, which simply sets the flag bStartRotationInterp, making the actor start rotating in the next Tick() call.

FMath::RInterpConstantTo() is a static function that takes four arguments. The first is the current actor rotator GetActorRotation(), not the initial actor rotator. The second is the target actor rotator, which is FRotator::ZeroRotator in this example. The third is DeltaTime, the duration in seconds between two frames. The last is the interpolation speed, which the documentation does not adequately explain.

Here is what the interpolation speed actually represents. On each call of FMath::RInterpConstantTo(), each of the pitch, yaw and roll of the actor is incremented by a value k. k is first set to the difference between the target rotator and the current rotator on that axis, and is then clamped to [-m, m], where m equals DeltaTime times the interpolation speed. An interpolation speed of S therefore changes pitch, yaw and roll by at most S degrees per second.

FMath::RInterpConstantTo() returns the new rotator to be applied to the actor in each Tick(). To take the finite precision of floating point arithmetic into account, currentRotation.Equals(targetRotation) is subsequently performed, which tests equality within a tolerance. If it evaluates to true, the flag bStartRotationInterp is unset and the rotation stops.

//------------------------------------------------------------
//------------------------------------------------------------
void AMyActor::PerformRotationInterpWithDelay(const float Delay)
{
    GetWorldTimerManager().SetTimer(StartRotationDelayTimerHandle,
                                    this,
                                    &AMyActor::PerformRotationInterpCallback,
                                    1.0f,   // time interval in seconds
                                    false,  // loop
                                    Delay); // delay in seconds
}

//------------------------------------------------------------
//------------------------------------------------------------
void AMyActor::PerformRotationInterpCallback()
{
    bStartRotationInterp = true;
}

//------------------------------------------------------------
//------------------------------------------------------------
void AMyActor::Tick(float DeltaTime)
{
    Super::Tick(DeltaTime);

    ...

    // interp rotation
    if (bStartRotationInterp)
    {
        FRotator NewRotation = FMath::RInterpConstantTo(GetActorRotation(), FRotator::ZeroRotator, DeltaTime, 100.0f);
        SetActorRotation(NewRotation);

        if (GetActorRotation().Equals(FRotator::ZeroRotator))
        {
            bStartRotationInterp = false;
        }
    }
}
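
To make the clamping rule concrete outside the engine, here is a minimal standalone sketch of one interpolation step as described above. It is not the engine source: Rot, NormalizeAxis(), StepAxis(), InterpConstantToSketch(), NearlyEqual() and StepsToReach() are hypothetical names made up for illustration, the wrap of the angle difference into (-180, 180] is my assumption about how the shortest path is chosen, and the engine's handling of special cases (such as a zero DeltaTime or a non-positive speed) is deliberately left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical stand-in for FRotator; all angles are in degrees.
struct Rot
{
    float Pitch;
    float Yaw;
    float Roll;
};

// Wrap an angle difference into (-180, 180] so each axis takes the shortest path.
inline float NormalizeAxis(float Angle)
{
    Angle = std::fmod(Angle, 360.0f);
    if (Angle > 180.0f)   { Angle -= 360.0f; }
    if (Angle <= -180.0f) { Angle += 360.0f; }
    return Angle;
}

// Move one axis toward To by k, where k is the difference clamped to [-m, m].
inline float StepAxis(float From, float To, float m)
{
    const float k = std::max(-m, std::min(NormalizeAxis(To - From), m));
    return From + k;
}

// One interpolation step with m = DeltaTime * Speed (degrees per step).
inline Rot InterpConstantToSketch(const Rot& Current, const Rot& Target,
                                  float DeltaTime, float Speed)
{
    assert(DeltaTime >= 0.0f && Speed >= 0.0f);
    const float m = DeltaTime * Speed;
    return { StepAxis(Current.Pitch, Target.Pitch, m),
             StepAxis(Current.Yaw,   Target.Yaw,   m),
             StepAxis(Current.Roll,  Target.Roll,  m) };
}

// Tolerance-based comparison, in the spirit of the Equals() check in Tick().
inline bool NearlyEqual(const Rot& A, const Rot& B, float Tolerance)
{
    return std::fabs(NormalizeAxis(A.Pitch - B.Pitch)) <= Tolerance
        && std::fabs(NormalizeAxis(A.Yaw   - B.Yaw))   <= Tolerance
        && std::fabs(NormalizeAxis(A.Roll  - B.Roll))  <= Tolerance;
}

// Count the steps needed to reach Target; MaxSteps guards against a zero speed.
inline int StepsToReach(Rot Current, const Rot& Target,
                        float DeltaTime, float Speed, int MaxSteps = 10000)
{
    int Steps = 0;
    while (!NearlyEqual(Current, Target, 1.0e-4f) && Steps < MaxSteps)
    {
        Current = InterpConstantToSketch(Current, Target, DeltaTime, Speed);
        ++Steps;
    }
    return Steps;
}
```

With DeltaTime = 0.125 and a speed of 80, m is 10 degrees per step, so a rotator of (30, 90, -45) reaches zero in 9 steps: the yaw axis, being the farthest from its target, determines the total duration.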

Porting code from Nvidia GPU to AMD : lesson learned

In the past several weeks I have been porting a codebase from Nvidia CUDA platform to AMD HIP. Several critical issues were encountered, some solved, some attributed to compiler bugs, some remaining unfathomable to me.

There are 3 important things I have learned so far from the painstaking debugging process.

  • For an unsigned integer with n bits, the shift count of a bitwise left shift (<<) must be between 0 and n-1. A larger shift count leads to undefined behavior. On the Nvidia platform, 0 bits are shifted in, whereas on AMD, 1 bits are shifted in!!!
  • Currently there is a serious compiler bug: the wavefront vote function __any(pred), which is supposed to work like __any_sync(__activemask(), pred) in CUDA, yields incorrect results in divergent threads!!!
  • This is very easy to miss: the parameter of the wavefront vote functions __any(pred), __all(pred), etc. is a 32-bit integer on both the Nvidia and AMD platforms. If, however, a 64-bit integer is passed to the function, the higher bits will be truncated!!! The solution is to explicitly cast the 64-bit integer to bool, which is then implicitly cast to int.
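
The first and third points can be illustrated on the host side with a small, hedged C++ sketch. It does not call any CUDA or HIP API: VotePredicateSeenByCallee() is a hypothetical stand-in for a vote function that takes a 32-bit int, and SafeShiftLeft(), PassTruncated() and PassAsBool() are names made up for this example. Returning 0 for an out-of-range shift count is my own design choice, not something either platform guarantees.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Left shift that is well defined for any shift count.
// Assumption: a count >= the bit width should produce 0.
template <typename T>
inline T SafeShiftLeft(T Value, unsigned Count)
{
    static_assert(std::is_unsigned<T>::value, "T must be an unsigned integer type");
    const unsigned Bits = static_cast<unsigned>(std::numeric_limits<T>::digits);
    return Count < Bits ? static_cast<T>(Value << Count) : static_cast<T>(0);
}

// Hypothetical stand-in for a vote function whose parameter is a 32-bit int.
// It simply returns the predicate value that the callee actually receives.
inline int VotePredicateSeenByCallee(int Pred)
{
    return Pred;
}

// Pitfall: narrowing a 64-bit value to 32 bits keeps only the low 32 bits,
// so a mask whose set bits are all in the upper half becomes 0.
inline int PassTruncated(std::uint64_t Mask)
{
    return VotePredicateSeenByCallee(static_cast<int>(static_cast<std::uint32_t>(Mask)));
}

// Fix: cast to bool first; any nonzero mask becomes true, then 1 as an int.
inline int PassAsBool(std::uint64_t Mask)
{
    return VotePredicateSeenByCallee(static_cast<bool>(Mask));
}
```

For a mask such as 1ull << 40, PassTruncated() hands 0 to the callee while PassAsBool() hands it 1, which is the difference between a vote that silently fails and one that behaves as intended.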

FindNVML.cmake done correctly — how to have CMake find Nvidia Management Library (NVML) on Windows and Linux

[Last updated on Feb 26, 2020]

The latest CMake 3.17 has just started to officially support a new module, FindCUDAToolkit, where the NVML library is conveniently referenced by the target CUDA::nvml. With this new feature this article is now deprecated.

Nvidia Management Library (NVML) is a powerful API to get and set GPU states. Currently there is a lack of official CMake support. The first couple of Google search results point to a script on GitHub, which unfortunately is only partially correct and does not work on Windows. Here we provide a working solution, tested on Scientific Linux 6 and Windows 10, with CUDA 9.1 and CMake 3.11.

The NVML API is spread across several locations:

  • Linux
    • Header: ${CUDA_INCLUDE_DIRS}/nvml.h
    • Shared library: ${CUDA_TOOLKIT_ROOT_DIR}/lib64/stubs/libnvidia-ml.so
  • Windows
    • Header: ${CUDA_INCLUDE_DIRS}/nvml.h
    • Shared library: C:/Program Files/NVIDIA Corporation/NVSMI/nvml.dll
    • Import library: ${CUDA_TOOLKIT_ROOT_DIR}/lib/x64/nvml.lib

It is critical to note that, on Windows a dynamic library (.dll) is accompanied by an import library (.lib), which is different from a static library (also .lib). In CMake the target binary should link to the import library (.lib) directly instead of the .dll file. With that, the correct FindNVML.cmake script is shown in listing 1.

Listing 1

# FindNVML.cmake

if(${CUDA_VERSION_STRING} VERSION_LESS "9.1")
    string(CONCAT ERROR_MSG "--> ARCHER: Current CUDA version "
                            ${CUDA_VERSION_STRING}
                            " is too old. Must upgrade it to 9.1 or newer.")
    message(FATAL_ERROR ${ERROR_MSG})
endif()

# windows, including both 32-bit and 64-bit
if(WIN32)
    set(NVML_NAMES nvml)
    set(NVML_LIB_DIR "${CUDA_TOOLKIT_ROOT_DIR}/lib/x64")
    set(NVML_INCLUDE_DIR ${CUDA_INCLUDE_DIRS})

    # .lib import library full path
    find_file(NVML_LIB_PATH
              NO_DEFAULT_PATH
              NAMES nvml.lib
              PATHS ${NVML_LIB_DIR})

    # .dll full path
    find_file(NVML_DLL_PATH
              NO_DEFAULT_PATH
              NAMES nvml.dll
              PATHS "C:/Program Files/NVIDIA Corporation/NVSMI")
# linux
elseif(UNIX AND NOT APPLE)
    set(NVML_NAMES nvidia-ml)
    set(NVML_LIB_DIR "${CUDA_TOOLKIT_ROOT_DIR}/lib64/stubs")
    set(NVML_INCLUDE_DIR ${CUDA_INCLUDE_DIRS})

    find_library(NVML_LIB_PATH
                 NO_DEFAULT_PATH
                 NAMES ${NVML_NAMES}
                 PATHS ${NVML_LIB_DIR})
else()
    message(FATAL_ERROR "Unsupported platform.")
endif()

find_path(NVML_INCLUDE_PATH
          NO_DEFAULT_PATH
          NAMES nvml.h
          PATHS ${NVML_INCLUDE_DIR})

include(FindPackageHandleStandardArgs)
find_package_handle_standard_args(NVML DEFAULT_MSG NVML_LIB_PATH NVML_INCLUDE_PATH)

Once find_package(NVML) is called in user CMake code, two cache variables are generated: NVML_LIB_PATH and NVML_INCLUDE_PATH. For Windows, there is an additional NVML_DLL_PATH.

Lo Wang’s fortune cookie quotes

    • You are not illiterate.
    • Confucius say it is easy to hate and difficult to love. Frankie say relax.
    • Thank you Lo Wang! But your fortune is in another cookie!
    • What’s a seven-letter word for ‘cryptic’?
    • Live each day like it’s your last. Or at least today, because… Oh I don’t want to spoil it.
    • Light travels faster than sound. That’s why some people look brilliant, until you hear them speak.
    • Whoever coined the phrase ‘quiet as a mouse’ never stepped on one.
    • Help! I am being held prisoner in a video game factory.
    • Don’t sweat the petty things and don’t pet the sweaty things. – George Carlin
    • You’re never too old to learn something stupid.
    • All men eat, but Fu Man Chu.
    • Some mistakes are too fun to make only once.
    • With sufficient thrust, pigs fly just fine.
    • He who takes advice from a cookie is sure to crumble.
    • You will stop procrastinating. Later.
    • Cardboard belt is a waist of paper.
    • The difference between an oral thermometer and a rectal thermometer is all a matter of taste.
    • You don’t need a parachute to skydive. You need a parachute to skydive twice.
    • Laugh and the world laughs with you. Cry and the world laughs at you.
    • That’s what ki said.
    • The good news: you’re not paranoid. The bad news: everyone is actually trying to kill you.
    • The early bird gets the worm. The second mouse gets cheese.
    • Small cookies bring great joy.
    • Time is an illusion. Lunchtime doubly so.
    • To be is to do. – Socrates
      To do is to be. – Sartre
      Do be do be do. – Sinatra
    • It is better to have loved and lost than to have had loved and gotten syphilis.
    • Cookie monster wasn’t here.
    • Chew, or chew not. There is no pie.
    • To maintain perfect accuracy, shoot first and call whatever you hit the target.
    • Information is not knowledge. Knowledge is not wisdom. Wisdom is not truth. Truth is not beauty. Beauty is not love. Love is not music. Music is the best. – FZ
    • Man who stand on toilet, high on pot.
    • It is better to have loved and lost than to have loved and got syphilis.

 

Reference: Shadow Warrior (2013)

The legend continues

Day 1 is pretty disappointing. rapha lost to cooller without winning a single round; Team Liquid was stomped by 2z; the Quakecon venue lost its internet connection.

Day 2 is full of amazing matches. rapha managed to win two matches in the group stage, and then survived a scare from Raisy. However, in the quarterfinal, he lost to his self-effacing teammate dahang. I root for rapha, and hope he will come back stronger next year. cooller continued his invincible run, until he was surprisingly thwarted by the young talent clawz in the semifinal.

Final day: clawz will be remembered forever in Quake history as the first player to win the championship in both duel and sacrifice. Everybody was awed, mouth agape. Also remembered will be clawz's extremely bad manners towards Vo0 and the way he talked in his interview. Guess the kid has a lot to learn.

Prey 2017

A pretty decent first person shooter of 2017. The combat is very challenging and requires appropriate strategies. Some of my tips:

  • In case of a sudden enemy encounter, press mouse wheel to go to the favorite wheel menu that pauses the game and gives you time to select the most suitable weapon or ability.
  • To fight mimic and regular phantom: use gloo gun to freeze them, use psychoscope to scan them, hit with pistol or kinetic blast.
  • In psychotronics there is a morgue with a locked door. An easy way to enter the room is: break the window, mimic a small object, and roll in.
  • After completing the side quest Psychic Water, drinking water from the fountains in office rooms restores a large amount of psi and a small amount of hp. Don’t forget to take advantage of this bonus.
  • The first half of the game is especially hard. Use a combination of silenced pistol, gloo gun, combat focus and kinetic blast.
  • The second half of the game is much easier. (Ab)use mindjack and machine mind in tandem with psychoshock.
  • In a side quest you can choose whether to kill Dahl or incapacitate him (removing his last neuromod). Use the stun gun to do the latter.
  • Enemies are regenerated each time the player reenters a map that has been cleared before. Do not try to eliminate all enemies as it will deplete the ammunition or psi points very quickly. This is especially true for the second half of the game. Instead, mind control them temporarily and run away.

I would rate this game 7/10.

  • Pro:
    • Careful thinking and combat strategies are required, which makes this game stand out.
    • Plenty of good ideas implemented by the development team that allow the players to play in their own way, for instance the mimic matter ability and the gloo gun.
    • A nod to Bioshock/System Shock. They use my favorite font, Century Gothic, the same as in Bioshock.
  • Con:
    • Graphics glitches such as flickering shadows and low quality textures at mid distance. In certain areas, such as the reactor room, the fps may drop abruptly to 20 or lower. The graphics are not very impressive even under the “very high” setting.
    • Hacking (particularly level 2) is a painful experience for PC gamers that use a regular keyboard. The control is simply bad.
    • The NPCs are not very well made. They lack facial expressions. They are not good looking. The textures and shading are rather coarse.
    • Similar to Bioshock Infinite, when the player is sprinting, the mouse sensitivity is reduced by design, which I hate very much.
    • The ending is rather anticlimactic. I don’t find it satisfying. This is also similar to Bioshock Infinite, which I personally think is an overrated game.

 

Witcher 3 tips

  • My favorite alchemy abilities
    • acquired tolerance
    • heightened tolerance
    • poisoned blades
  • My favorite decoction
    • ekhidna: actions consuming stamina restore vitality.
    • ancient leshen: each sign cast increases stamina regeneration.
    • foglet: increases sign intensity in cloudy weather.
    • ekimmara: damage dealt to foes restores vitality.
  • My favorite potion
    • tawny owl: increases stamina regeneration speed. The superior version never expires at night!
    • thunderbolt: increases attack power.
    • golden oriole: grants immunity to poison.
  • My favorite bomb:
    • devil’s puffball: releases poisonous gas. When used together with golden oriole, a fight against a group of human enemies becomes a piece of cake.
  • Other important tips
    • Burning and poison effect damages enemies by an amount proportional to their health.
    • Some monsters, such as werewolves, can regenerate a large amount of vitality very quickly. A green number rising above them indicates the recovered health. If further attacks prove to be in vain, the player should not take this quest at the current level even if it is close to the suggested one. Simply level up and make the slash more deadly.

Build Geant4 (including OpenGL visualization) using Cygwin on Windows

[Updated on Nov 15, 2019]

There has been a lack of official documentation on how to build Geant4 using Cygwin on Windows. This post is intended to fill the gap. All we need to do is to install several Cygwin packages, modify a couple of cmake scripts, and change a few lines of Geant4 source code.

Test conditions

  • Geant4 10.5.1
  • Windows 10 (64-bit)
  • Cygwin 64 version 3.0.7
  • gcc and g++ version 7.4.0
  • cmake version 3.14.5

The following Cygwin packages are required to build a C++ project with CMake.

  • gcc-g++
  • cmake
  • make

The following Cygwin packages are required to build Geant4 core engine.

  • expat, libexpat-devel
  • zlib, zlib-devel

The following Cygwin packages are required to build Geant4 OpenGL visualization module.

  • libX11-devel
  • libXmu-devel
  • libGL-devel
  • xinit
  • xorg-server
  • xorg-x11-fonts-*

Steps

  • Modify cmake scripts.
    • In both cmake\Modules\G4BuildSettings.cmake and cmake\Modules\G4ConfigureCMakeHelpers.cmake, change CMAKE_CXX_EXTENSIONS from OFF to ON, i.e.
      set(CMAKE_CXX_EXTENSIONS ON)
      

      The above steps are crucial in that they let the compiler flag -std=gnu++11 be automatically added in place of the initial flag -std=c++11. On Cygwin, -std=c++11 makes the POSIX function posix_memalign() inaccessible, which causes Geant4 compile errors.

  • Modify source code.
    • In source\processes\electromagnetic\dna\utils\include\G4MoleculeGun.hh, add declaration of explicit specialization immediately after class definition:
      template<typename TYPE>
      class TG4MoleculeShoot : public G4MoleculeShoot
      {
      public:
          TG4MoleculeShoot() : G4MoleculeShoot(){;}
          virtual ~TG4MoleculeShoot(){;}
          void Shoot(G4MoleculeGun*){}
      
      protected:
          void ShootAtRandomPosition(G4MoleculeGun*){}
          void ShootAtFixedPosition(G4MoleculeGun*){}
      };
      
      // Above is class definition in Geant4
      // We need to add three lines of code here
      // to declare explicit specialization
      
      template<> void TG4MoleculeShoot<G4Track>::ShootAtRandomPosition(G4MoleculeGun* gun);
      template<> void TG4MoleculeShoot<G4Track>::ShootAtFixedPosition(G4MoleculeGun* gun);
      template<> void TG4MoleculeShoot<G4Track>::Shoot(G4MoleculeGun* gun);
      

      Otherwise the compiler would complain about multiple definitions.

    • In source\global\management\src\G4Threading.cc, comment out the syscall.h include. Apparently Cygwin does not offer the OS-specific header file syscall.h, and thus does not support multithreading in Geant4, which relies on syscall.h.
      // #include <sys/syscall.h>  // comment out this line
      
  • Create out-of-source build script. Due to lack of syscall.h in Cygwin, only single-threaded Geant4 can be built.
    • Release build
      cmake ../geant4_src -DCMAKE_C_COMPILER=/usr/bin/gcc.exe \
      -DCMAKE_CXX_COMPILER=/usr/bin/g++.exe \
      -DCMAKE_INSTALL_PREFIX=/opt/geant4/release \
      -DCMAKE_BUILD_TYPE=Release \
      -DGEANT4_USE_SYSTEM_EXPAT=ON \
      -DGEANT4_USE_SYSTEM_ZLIB=ON \
      -DGEANT4_INSTALL_DATA=ON \
      -DGEANT4_USE_OPENGL_X11=ON
      
  • Build.
    make
    

    For a faster build, use make -j6, which runs 6 parallel jobs.

  • Install.
    make install
    
  • Visualization
    To visualize B1 example, in one Cygwin terminal:

    startxwin
    

    In another terminal:

    export DISPLAY=:0.0
    ./exampleB1.exe
    
  • Have fun with Geant4 !!! … and remember: If you love something, set it free.

Acknowledgement

Thanks to Charlie for making me aware of the issues in newer Geant4.