aboutsummaryrefslogtreecommitdiffhomepage
path: root/video/out/gl_video_shaders.glsl
Commit message (Collapse)AuthorAge
* vo_opengl: refactor shader generation (part 1)Gravatar wm42015-03-12
| | | | | | | | | | | | | | | | | | | The basic idea is to use dynamically generated shaders instead of a single monolithic file + a ton of ifdefs. Instead of having to setup every aspect of it separately (like compiling shaders, setting uniforms, perfoming the actual rendering steps, the GLSL parts), we generate the GLSL on the fly, and perform the rendering at the same time. The GLSL is regenerated every frame, but the actual compiled OpenGL-level shaders are cached, which makes it fast again. Almost all logic can be in a single place. The new code is significantly more flexible, which allows us to improve the code clarity, performance and add more features easily. This commit is incomplete. It drops almost all previous code, and readds only the most important things (some of them actually buggy). The next commit will complete it - it's separate to preserve authorship information.
* vo_opengl: implement antiringing for tensor scalersGravatar Niklas Haas2015-02-27
| | | | | | | | | | | | This is based on pretty much the same (somewhat naive) logic right now. I'm not convinced that the extra logic that eg. madVR includes is worth enough to warrant heavily confusing the logic for it. This shouldn't slow down the logic at all in any sane shader compiler, and indeed it doesn't on any shader compiler that I tested. Note that this currently doesn't affect cscale at all, due to the weird implementation details of that.
* vo_opengl: greatly increase smoothmotion performanceGravatar Niklas Haas2015-02-24
| | | | | | | | | | | | | | | | | | | | Instead of rendering and upscaling each video frame on every vsync, this version of the algorithm only draws them once and caches the result, so the only operation that has to run on every vsync is a cheap linear interpolation, plus CMS/dithering. On my machine, this is a huge speedup for 24 Hz content (on a 60 Hz monitor), up to 120% faster. (The speedup is not quite 250% because of the overhead that the larger FBOs and CMS provides) In terms of the implementation, this commit basically swaps interpolation and upscaling - upscaling is moved to inter_program, and interpolation is moved to the final_program. Furthermore, the main bulk of the frame rendering logic (upscaling etc.) was moved to a separete function, which is called from gl_video_interpolate_frame only if it's actually necessarily, and skipped otherwise.
* vo_opengl: extend ifdef against shader arraysGravatar wm42015-02-24
| | | | | | | GLES2 shaders do not have line continuation characters. Abuse the HAVE_ARRAYS define to exclude code which uses arrays, and which also happens to cover all code that defines multi-line macros. (So yes, this is a hack.)
* vo_opengl: explicitly check potential candidates for polar filtersGravatar Niklas Haas2015-02-24
| | | | | | | This adds a small check for candidates that could potentially be inside the radius, but aren't necessarily. This speeds up performance by a negligible amount on my hardware, but it's mainly a prerequisite for a further change (using a larger true radius).
* vo_opengl: glsl: remove trailing \Gravatar wm42015-02-16
| | | | | This should be no problem... but it _might_ help with #1536, so it's worth a try.
* vo_opengl: add support for linear scaling without CMSGravatar Niklas Haas2015-02-06
| | | | | | | | | | This introduces a new option linear-scaling, which is now implied by srgb, icc-profile and sigmoid-upscaling. Notably, this means (sigmoidized) linear upscaling is now enabled by default in opengl-hq mode. The impact should be negligible, and there has been no observation of negative side effects of sigmoidized scaling, so it feels safe to do so.
* vo_opengl: change initialization of gamma optionGravatar wm42015-02-03
| | | | | | | | | | Make the lazy gamma initialization less weird, and make the default value of the "gamma" sub-option 1.0. This means --vo=opengl:help will list the actual default value. Also change the lower bound to 0.1 - avoids a division by zero (I don't know how shaders handle NaN, but it's probably not a good idea to give them this value).
* csputils, vo_opengl: remove per-component gammaGravatar wm42015-02-03
| | | | | | | | | There was some code accounting for different gamma values for R/G/B. It's inherited from an old, undocumented MPlayer feature, which was at some point disabled for convenience by myself (meaning you couldn't actually set separate gamma because it was removed from the property interface - mp_csp_copy_equalizer_values() just set them to the same value). Get rid of these meaningless leftovers.
* vo_opengl: always clamp the video to range 0-1Gravatar Niklas Haas2015-02-03
| | | | | | Before this, enabling :gamma in combination with :sigmoid and probably a few other things results in ugly artifacts because the video isn't clamped until after the :gamma was applied (or at all, if the cms_matrix is unused).
* vo_opengl: fix shader issue with Intel driversGravatar wm42015-01-29
| | | | | | | | | Windows Intel drivers seem to reject some (AFAIK) valid GLSL. Make them happy. <rossy> GL_RENDERER='Intel(R) HD Graphics 4400' <rossy> GL_VERSION='3.0.0 - Build 10.18.14.4080' <rossy> GL_SHADING_LANGUAGE_VERSION='1.30 - Build 10.18.14.4080'
* vo_opengl: change the way unaligned chroma size is handledGravatar wm42015-01-27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | This deals with subsampled YUV video that has odd sizes, for example a 5x5 image with 4:2:0 subsampling. It would be easy to handle if we actually passed separate texture coordinates for each plane to the shader, but as of now the luma coordinates are implicitly rescaled to chroma one. If luma and chroma sizes don't match up, and this is not handled, you'd get a chroma shift by 1 pixel. The existing hack worked, but broke separable scaling. This was exposed by a recent commit which switched to GL_NEAREST sampling for FBOs. The rendering was accidentally scaled by 1 pixel, because the FBO size used the original video size, while textures_sizes[0] was set to the padded texture size (i.e. one pixel larger). It could be fixed by setting the padded texture size only on the first shader. But somehow that is annoying, so do something else. Don't pad textures anymore, and rescale the chroma coordinates in the shader instead. Seems like this somehow doesn't work with rectangle textures (and introduces a chroma shift), but since it's only used when doing VDA hardware decoding, and the bug occurs only with unaligned video sizes, I don't care much. Fixes #1523.
* vo_opengl: add smoothmotion frame blendingGravatar Stefano Pigozzi2015-01-23
| | | | | | | | | | | | | | | | | | | SmoothMotion is a way to time and blend frames made popular by MadVR. It's intended behaviour is to remove stuttering caused by mismatches between the display refresh rate and the video fps, while preserving the video's original artistic qualities (no soap opera effect). It's supposed to make 24fps video playback on 60hz monitors as close as possible to a 24hz monitor. Instead of drawing a frame once once it's pts has passed the vsync time, we redraw at the display refresh rate, and if we detect the vsync is between two frames we interpolated them (depending on their position relative to the vsync). We actually interpolate as few frames as possible to avoid a blur effect as much as possible. For example, if we were to play back a 1fps video on a 60hz monitor, we would blend at most on 1 vsync for each frame (while the other 59 vsyncs would be rendered as is). Frame interpolation is always done before scaling and in linear light when possible (an ICC profile is used, or :srgb is used).
* vo_opengl: rename all scale options to make more senseGravatar Niklas Haas2015-01-22
| | | | | This emphasizes the fact that scale is used for *all* image upscaling, with cscale only serving a minor role for subsampled material.
* vo_opengl: switch to nearest neighbour for trivial resamplingGravatar Niklas Haas2015-01-22
| | | | | | | | | | This is significantly faster for FBOs on most modern GPUs, although it did not result in a huge difference for the video source texture on the sizes I tested. It might be more significant for 1080p or 4K content, so it's worth revisiting this in the future. It also renames SAMPLE_BILINEAR to SAMPLE_TRIVIAL to match the semantics.
* vo_opengl: implement naive anti-ringingGravatar Niklas Haas2015-01-22
| | | | | | | | This is not quite the same thing as madVR's antiringing algorithm, but it essentially does something similar. Porting madVR's approach to elliptic coordinates will take some amount of thought.
* vo_opengl: unroll ewa_lanczos to avoid looping and unnecessary samplesGravatar Niklas Haas2015-01-22
| | | | | | | This speeds up performance by a factor of something like 10%, since it omits unnecessary checks. This will also make adding anti-ringing easier.
* vo_opengl: clean up ewa_lanczos codeGravatar Niklas Haas2015-01-22
| | | | | | This fixes compatibility with GLES 2.0 and makes the code a bit neater in general. It also properly forces indirect scaling for subsampled video regardless of the lscale setting.
* vo_opengl: handle grayscale input better, add YA16 supportGravatar wm42015-01-21
| | | | | | | | | | Simply clamp off the U/V components in the colormatrix, instead of doing something special in the shader. Also, since YA8/YA16 gave a plane_bits value of 16/32, and a colormatrix calculation overflowed with 32, add a component_bits field to the image format descriptor, which for YA8/YA16 returns 8/16 (the wrong value had no bad consequences otherwise).
* vo_opengl: remove 1D texture usageGravatar wm42015-01-18
| | | | | | | Broke operation with GLSL. Since 1D texture usage was apparently (and mysteriously) good for speed, it might be added back, but it's unknown how to do so in a clean way.
* vo_opengl: get rid of approx-gamma and make it the default as per BT.1886Gravatar Niklas Haas2015-01-16
| | | | | | | | | | | | | | | | | | | | | | | | | | After finding out more about how video mastering is done in the real world it dawned upon me why the "hack" we figured out in #534 looks so much better. Since mastering studios have historically been using only CRTs, the practice adopted for backwards compatibility was to simulate CRT responses even on modern digital monitors, a practice so ubiquitous that the ITU-R formalized it in R-Rec BT.1886 to be precisely gamma 2.40. As such, we finally have enough proof to get rid of the option altogether and just always do that. The value 1.961 is a rounded version of my experimentally obtained approximation of the BT.709 curve, which resulted in a value of around 1.9610336. This is the closest average match to the source brightness while preserving the nonlinear response of the BT.1886 ideal monitor. For playback in dark environments, it's expected that the gamma shift should be reproduced by a user controlled setting, up to a maximum of 1.224 (2.4/1.961) for a pitch black environment. More information: https://developer.apple.com/library/mac/technotes/tn2257/_index.html
* vo_opengl: add ewa_lanczos upscaler (aka jinc)Gravatar Niklas Haas2015-01-15
| | | | | This is the polar (elliptic weighted average) version of lanczos. This introduces a general new form of polar filters.
* video: Add sigmoidal upscaling to avoid ringing artifactsGravatar Niklas Haas2015-01-09
| | | | | | | | | This avoids issues when upscaling directly in linear light, and is the recommended way to upscale images according to imagemagick. The default slope of 6.5 offers a reasonable compromise between ringing artifacts eliminated and ringing artifacts introduced by sigmoid-upscaling. Same goes for the default center of 0.75.
* video: Remove some stale CMS code, minor cosmeticsGravatar Niklas Haas2015-01-07
| | | | This removes an old code path that was disabled in 016bb14.
* vo_opengl: remove obsolete comment in shaderGravatar wm42015-01-04
|
* vo_opengl: improve fallback handling with GLESGravatar wm42014-12-21
| | | | | | | | | Whether we have texture_rg doesn't matter much anymore; the scaler should be fine with this. But on ES 2.0, 1st class arrays are missing, so even if filterable float textures should be available, it won't work. Dithering (at least the "fruit" variant) will not work either, because it uses floats.
* vo_opengl: GLES does not support GL_BGRAGravatar wm42014-12-20
| | | | | | | | | | | | | Apparently GLES 2 and 3 do not support this. (The implementations I tested with were derived from desktop OpenGL and were not overly strict with this.) This is no problem; just use GL_RGBA and mangle the channels in the shader. Also disable direct support for image formats like IMGFMT_RGB555 with GLES; at least some of them are not supported in this form, and the formats aren't important anyway.
* vo_opengl: add GLES 2 supportGravatar wm42014-12-19
| | | | | | | | Rather basic support. Almost nothing works, and even if it does, it's bound to be inefficient (due to texture upload). This was tested with the nVidia desktop binary drivers, which provide GLES 2 support only. However, nVidia is not known to be very strict about OpenGL, and the driver is very new too, so the vo_opengl code will have bugs too.
* vo_opengl: do not use 4x3 matrixGravatar wm42014-12-18
| | | | | | | | | | | This was a nice trick to get the mpv colormatrix directly into OpenGL, because the memory representation happened to match. Unfortunately, OpenGL ES 2 doesn't have glUniformMatrix4x3fv(). Even more unfortunately, the memory representation is now incompatible. It would be nice to change it, but that would mean getting into a big mess.
* vo_opengl: simplify the case without texture_rgGravatar wm42014-12-18
| | | | | | | | | | | | If GL_RED was not available, we used GL_ALPHA. But this is an unnecessary complication, and it's easier to use GL_LUMINANCE instead. With the latter, a texture will return the .r component set, and as long as the shader doesn't look at the other components, the shader doesn't need any changes. Some of the changes added in 0e8fbdbd are now unneeeded. Also, realign the entire gl_byte_formats_legacy table.
* vo_opengl: GLES 3 supportGravatar wm42014-12-17
| | | | | | | | | | | | Tested with MESA on software emulation. Seems to work well, although the default FBO format in opengl-hq disables most interesting features. I have no idea how well it will work on real hardware (or if it does at all). Unfortunately, some features, including playback of 10 bit video, are not supported. Not sure what to do about this. GLES 2 or 1 do not work.
* vo_opengl: glsl: stricter typingGravatar wm42014-12-17
| | | | | | | | Older GLSL dialects as well as GLES3 do not support the following things in expressions: - implicit conversions of integer constants to float - arithmetic of float*vecN
* vo_opengl: remove requirement for RG texturesGravatar wm42014-12-16
| | | | | Features not supported are disabled (although with a misleading error message).
* vo_opengl: use all filter sizes possible with the shadersGravatar wm42014-12-08
| | | | | | | | | | | | | | Not all filter sizes the shaders could handle were in the filter_sizes list. The shader can handle any multiple of 4 (the sizes 2 and 6 are special-cased to keep it simple). Add all possible filter sizes, up to 64. 64 is ridiculously high anyway. Most of the larger filter sizes are completely useless for upscaling, but help with the fancy-downscaling option. (Although it would still be more efficient to use cascaded scalers to handle downscaling better.) I considered doing something less stupid than the hardcoded array, but it seems this is still the simplest solution.
* vo_opengl: refactor: instantiate scaler functions at runtimeGravatar wm42014-12-08
| | | | | | | | | | | | | | | | | Before this commit, the convolution scaler shader functions were pre- instantiated in the shader file. For every filter size, a corresponding function (with the filter size as suffix) had to be present. Change this, and make the C code emit the necessary bits. This means the shader code is much reduced. (Although hopefully it doesn't make shader compilation faster - it would require a really dumb compiler if it spends its time on dead code.) It also makes it more flexible, which is the main goal. The DEF_SCALER0 stuff is needed because the C code writes the header of the shader, at a point where scaler macros are not defined yet.
* vo_opengl: never use 1D textures for lookup texturesGravatar wm42014-12-08
| | | | | | | | | This was a microoptimization for small filters which need 4 or less weights per sample point. When I originally wrote this code, using a 1D texture seemed to give a slight speed gain, but now I couldn't measure any difference. Remove this to simplify the code.
* vo_opengl: refactor: merge convolution function and sampler entrypointGravatar wm42014-12-08
| | | | | | | | | There's not much of a reason to have the actual convolution code in a separate function. Merging them actually simplifies the code a bit, and gets rid of the repetitious macro invocations to define the functions for each filter size. There should be no changes in behavior or output.
* vo_opengl: extend filter size to 64Gravatar wm42014-12-06
| | | | | | For better downscaling. Maybe the list of filter sizes shouldn't be static...
* vo_opengl: extend filter size to 32Gravatar wm42014-12-06
| | | | | | Also replace the weights calculations for 8/12/16 with the generic weight function definition macro. (The weights 2/4/6 follow slightly different rules.)
* vo_opengl: Linearize non-RGB sRGB files correctly (eg. JPEG)Gravatar Niklas Haas2014-11-26
| | | | Signed-off-by: wm4 <wm4@nowhere>
* vo_opengl: Reword comment in shaderGravatar Niklas Haas2014-11-26
| | | | | I didn't quite understand this comment after looking at the code again months later, so I reworded it for better clarity.
* vo_opengl: draw OSD twice in 3D mode caseGravatar wm42014-10-29
| | | | | | | | | | | | | Apparently this is needed for correct 3D mode subtitles. In general, it seems you need to duplicate the whole "GUI", so it's done for all OSD elements. This doesn't handle the "duplication" of the mouse pointer. Instead, the mouse can be used for the top/left field only. Also, it's possible that we should "compress" the OSD in the direction it's duplicated, but I don't know about that. Fixes #1124, at least partially.
* vo_opengl: remove macro operator from shaderGravatar Bin Jin2014-08-29
| | | | Removes '##' operator from OpenGL shader code.
* vo_opengl: fix shaderGravatar wm42014-08-28
| | | | | | | | | | | | Regression since commit f14722a4. For some reason, this worked on nvidia, but rightfully failed on mesa. At least in C, the ## operator indeed needs two macro arguments, and you can't just concatenate with non-arguments. This change will most likely fix it. CC: @bjin
* vo_opengl: add cparam1 and cparam2 optionsGravatar Bin Jin2014-08-26
| | | | | | Although cscale is rarely used, it's possible that params of cscale are accidentally set to lparam1 and lparam2, which might cause unexpected results.
* vo_opengl: Make approx-gamma affect OSD/subGravatar Niklas Haas2014-06-22
| | | | | | Close #837 Signed-off-by: wm4 <wm4@nowhere>
* video: Generate an accurate CMS matrix instead of hard-codingGravatar Niklas Haas2014-06-22
| | | | | | | | | This also avoids an extra matrix multiplication when using :srgb, making that path both more efficient and also eliminating more hard-coded values. In addition, the previously hard-coded XYZ to RGB matrix will be dynamically generated.
* video: Support BT.2020 constant luminance systemGravatar Niklas Haas2014-06-22
| | | | Signed-off-by: wm4 <wm4@nowhere>
* video: Add support for non-BT.709 primariesGravatar Niklas Haas2014-06-22
| | | | | | | This add support for reading primary information from lavc, categorized into BT.601-525, BT.601-625, BT.709 and BT.2020; and passes it on to the vo. In vo_opengl, we always generate the 3dlut against the wider BT.2020 and transform our source into this colorspace in the shader.
* video: Add BT.2020-NCL colorspace and transfer functionGravatar Niklas Haas2014-06-22
| | | | Source: http://www.itu.int/dms_pubrec/itu-r/rec/bt/R-REC-BT.2020-0-201208-I!!PDF-E.pdf