aboutsummaryrefslogtreecommitdiffhomepage
path: root/video/out/opengl
Commit message (Collapse)AuthorAge
...
* vo_opengl: don't pass negative height to overlay_adjust()Gravatar wm42016-09-16
| | | | | | Negative height is used to signal a flipped framebuffer. There's absolutely no reason to pass this down to overlay_adjust(), and only requires implementers to deal with an additional special-case.
* hwdec_cuda: Rename config variable to be more consistentGravatar Philip Langdale2016-09-16
| | | | | | 'cuda-gl' isn't right - you can turn this on without any GL and get some non-zero benefit (with the cuda-copy hwaccel). So 'cuda-hwaccel' seems more consistent with everything else.
* vo_opengl: rpi: cosmetic changeGravatar wm42016-09-15
| | | | I almost feel sorry wasting a commit on this.
* vo_opengl: fix OSD with icc-profile after previous commitGravatar wm42016-09-14
| | | | | | This happened to break because the texture unit wasn't reset to 0, which some code expects. The OSD code in particular set the OSD texture on the wrong texture unit, with the result that OSD/OSC was not visible.
* vo_opengl: dynamically manage texture unitsGravatar wm42016-09-14
| | | | | | | | | | | | | | | | | | | | | A minor cleanup that makes the code simpler, and guarantees that we cleanup the GL state properly at any point. We do this by reusing the uniform caching, and assigning each sampler uniform its own texture unit by incrementing a counter. This has various subtle consequences for the GL driver, which hopefully don't matter. For example, it will bind fewer textures at a time, but also rebind them more often. For some reason we keep TEXUNIT_VIDEO_NUM, because it limits the number of hook passes that can be bound at the same time. OSD rendering is an exception: we do many passes with the same shader, and rebinding the texture each pass. For now, this is handled in an unclean way, and we make the shader cache reserve texture unit 0 for the OSD texture. At a later point, we should allocate that one dynamically too, and just pass the texture unit to the OSD rendering code. Right now I feel like vo_rpi.c (may it rot in hell) is in the way.
* vo_opengl: require explicit reset on shader cache after renderingGravatar wm42016-09-14
| | | | | | | | | The caller now has to call gl_sc_reset(), and _after_ rendering. This way we can unset OpenGL state that was setup for rendering. This affects the shader program, for example. The next commit uses this to automatically manage texture units via the shader cache. vo_rpi.c changes untested.
* vo_opengl: remove a redundant glActiveTexture() callGravatar wm42016-09-14
| | | | | This bound video textures to individual texture units - this is how it used to work long ago, but now is pointless, and maybe even dangerous.
* vo_opengl: make the number of PBOs tunableGravatar Niklas Haas2016-09-14
| | | | | Also set the number of PBOs from 2 to 3, which should be better for pipelining. This makes it easier to add more in the future.
* vo_opengl: drm: use new EGL context creation codeGravatar wm42016-09-14
|
* vo_opengl: wayland: use new EGL context creation codeGravatar wm42016-09-14
|
* vo_opengl: EGL: dump some version infoGravatar wm42016-09-14
|
* vo_opengl: EGL: better desktop-GL context creationGravatar wm42016-09-14
| | | | | | | | | | | | Stops Mesa from restricting us to OpenGL 3.0. It also tries to create GLES 3 contexts for drivers which do not just return a higher context when requesting GLES 2. I don't know whether this code is a good or bad idea. A not-so-good aspect is that we don't check for EGL 1.5 (or 1.4 extensions) for some of the more advanced context attributes. But EGL implementations should be able to tolerate it and return an error, and then we'd use the fallback.
* vo_opengl: EGL: silence eglBindAPI() error messageGravatar wm42016-09-13
| | | | | It's not helpful and will be printed with EGL implementations that don't support OpenGL at all. Just shut it up.
* vo_rpi, vo_opengl: separate RPI/EGL-specific code for both VOsGravatar wm42016-09-13
| | | | | | | | | This used to be shared, but since vo_rpi is going to be removed, untangle them. There was barely any actual code shared since the recent changes anyway. As a subtle change, we also stop opening libGLESv2.so explicitly in the vo_opengl backend, and use RTLD_DEFAULT instead.
* vo_opengl: rpi: use new egl context creation helper functionGravatar wm42016-09-13
| | | | Only for the "new" vo_opengl backend code.
* vo_opengl: mali fbdev supportGravatar wm42016-09-13
| | | | | | | | | | | | Minimal support just for testing. Only the window surface creation (including size determination) is really platform specific, so this could be some generic thing with platform-specific support as some sort of sub-driver, but on the other hand I don't see much of a need for such a thing. While most of the fbdev usage is done by the EGL driver, using this fbdev ioctl is apparently the only way to get the display resolution.
* vo_opengl: factor some EGL context creation codeGravatar wm42016-09-13
| | | | | | | Add a function to egl_helpers.c for creating an EGL context and make context_x11egl.c use it. This is meant to be generic, and should work with other windowing APIs as well. The other EGL-using code in mpv can be switched to it.
* vo_opengl: fix typo in bt.601 auto-guessing logicGravatar Niklas Haas2016-09-13
| | | | | | | The wrong enum got copied here, so it was essentially using the transfer characteristics as the primaries (instead of the primaries), which accidentally worked fine most of the time (since the two usually coincided), but broke on weird/mistagged files.
* vo_opengl: fix non-C11 TLS fallback for gccGravatar wm42016-09-12
| | | | | | The consequence of this was that e.g. hardware decoding with VAAPI-EGL could sometimes not work if the compiler didn't support C11. (Although I found this one on RPI, which also uses this mechanism.)
* vo_opengl: better behavior in GL error corner casesGravatar wm42016-09-12
| | | | | | | | | | | If the shader fails to compile, and assertion could trigger in gl_sc_gen_shader_and_reset() due to the code trying to recreate the shader every time, and re-appending the uniforms every time. Just reset the uniform array to fix this. Some disturbed GL drivers might not return anything for glGetShaderiv() if the GL state got "lost", so initialize variables just for additional robustness.
* vo_opengl: rpi: merge vo_rpi featuresGravatar wm42016-09-12
| | | | | | | | | | | Since vo_rpi is going to be deprecated, better port its features to the vo_opengl backend. The most tricky part is the fact that recreating dispmanx elements will conflict with the GL context. Fortunately, RPI's EGL support is reasonably compliant, and we can transplant the context to newly created dispmanx elements, making this much easier. This means unlike vo_rpi, the GL state will actually not be recreated.
* vo_opengl: add hw overlay support and use it for RPIGravatar wm42016-09-12
| | | | | | | | | | | This overlay support specifically skips the OpenGL rendering chain, and uses GL rendering only for OSD/subtitles. This is for devices which don't have performant GL support. hwdec_rpi.c contains code ported from vo_rpi.c. vo_rpi.c is going to be deprecated. I left in the code for uploading sw surfaces (as it might be slightly more efficient for rendering sw decoded video), although it's dead code for now.
* hwdec_cuda: Implement download_image for screenshotsGravatar Philip Langdale2016-09-10
| | | | Data has to be copied to system memory for screenshots.
* hwdec_cuda: Use the non-deprecated CUDA-GL interop APIGravatar Philip Langdale2016-09-10
| | | | | | | | | | The nvidia examples use the old (as in CUDA 3.x) interop API which is deprecated, and I think not even functional on recent versions of CUDA for windows. As I was following the examples, I used this old API. So, let's update to the new API, and hopefully, it'll start working on windows too.
* vo_opengl: use dedicated image unref function in config caseGravatar wm42016-09-08
| | | | | | | Just another corner-caseish potential issue. Unlike unreffing the image manually, unref_current_image() also takes care of properly unmapping hwdec frames. (The corner-case part of this is that it's probably never mapped at this point, but it's apparently not entirely guaranteed.)
* vo_opengl: simplify a conditionGravatar wm42016-09-08
| | | | | | | | | | | The " || vimg->mpi" part virtually never seems to trigger, but on the other hand could possibly create unintended corner cases (for example by trying to upload a NULL image, which would then be marked as an error and render a blue screen). I guess it's a leftover from over times, where a NULL image meant "redraw the current frame". This is now handled by actually passing along the current frame.
* hwdec/opengl: Add support for CUDA and cuvid/NvDecodeGravatar Philip Langdale2016-09-08
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Nvidia's "NvDecode" API (up until recently called "cuvid" is a cross platform, but nvidia proprietary API that exposes their hardware video decoding capabilities. It is analogous to their DXVA or VDPAU support on Windows or Linux but without using platform specific API calls. As a rule, you'd rather use DXVA or VDPAU as these are more mature and well supported APIs, but on Linux, VDPAU is falling behind the hardware capabilities, and there's no sign that nvidia are making the investments to update it. Most concretely, this means that there is no VP8/9 or HEVC Main10 support in VDPAU. On the other hand, NvDecode does export vp8/9 and partial support for HEVC Main10 (more on that below). ffmpeg already has support in the form of the "cuvid" family of decoders. Due to the design of the API, it is best exposed as a full decoder rather than an hwaccel. As such, there are decoders like h264_cuvid, hevc_cuvid, etc. These decoders support two output paths today - in both cases, NV12 frames are returned, either in CUDA device memory or regular system memory. In the case of the system memory path, the decoders can be used as-is in mpv today with a command line like: mpv --vd=lavc:h264_cuvid foobar.mp4 Doing this will take advantage of hardware decoding, but the cost of the memcpy to system memory adds up, especially for high resolution video (4K etc). To avoid that, we need an hwdec that takes advantage of CUDA's OpenGL interop to copy from device memory into OpenGL textures. That is what this change implements. The process is relatively simple as only basic device context aquisition needs to be done by us - the CUDA buffer pool is managed by the decoder - thankfully. The hwdec looks a bit like the vdpau interop one - the hwdec maintains a single set of plane textures and each output frame is repeatedly mapped into these textures to pass on. The frames are always in NV12 format, at least until 10bit output supports emerges. The only slightly interesting part of the copying process is that CUDA works by associating PBOs, so we need to define these for each of the textures. TODO Items: * I need to add a download_image function for screenshots. This would do the same copy to system memory that the decoder's system memory output does. * There are items to investigate on the ffmpeg side. There appears to be a problem with timestamps for some content. Final note: I mentioned HEVC Main10. While there is no 10bit output support, NvDecode can return dithered 8bit NV12 so you can take advantage of the hardware acceleration. This particular mode requires compiling ffmpeg with a modified header (or possibly the CUDA 8 RC) and is not upstream in ffmpeg yet. Usage: You will need to specify vo=opengl and hwdec=cuda. Note that hwdec=auto will probably not work as it will try to use vdpau first. mpv --hwdec=cuda --vo=opengl foobar.mp4 If you want to use filters that require frames in system memory, just use the decoder directly without the hwdec, as documented above.
* vo_opengl: fix incorrect video rendering after vdpau preemption recoveryGravatar wm42016-09-07
| | | | | This could also trigger in certain other cases, whenever it falls back to dumb mode.
* vo_opengl: fix another potential vdpau preemption issueGravatar wm42016-09-07
| | | | | | reinit() will change the image params fields, so give it a copy. Will fix potential crashes if preemption happens more than once.
* vo_opengl: simplify option handlingGravatar wm42016-09-06
| | | | | | | | | | | | | Instead of copying the options around... just don't. video.c now has full control over when options are updated. (It still gets notified from outside, but it decides when the updated options are copied: when m_config_cache_update() is called.) So there's no need for tricky stuff, and it can be simplified a bit. Also change lcms.c. We could do it like video.c, and get the options from the global config store. But it seems simpler to just provide a pointer to an option struct, which is arbitrarily mutated from the outside (from the perspective of lcms.c).
* vo_opengl: fix --icc-profile initial behaviorGravatar wm42016-09-06
| | | | | | | | | Setting --icc-profile had no effect, until a vo_opengl option was changed at runtime. We must initialize the renderer for the initial option state too. For some reason, the ICC profile gets loaded twice. The next commit happens to fix this.
* vo_opengl: deprecate sub-options, add them as global optionsGravatar wm42016-09-02
| | | | | | | | | | | | | | | | | | | | | | | | vo_opengl sub-option were always rather annoying to handle. It seems better to make them global options instead. This is simpler and easier to use. The only disadvantage we are aware of is that it's not clear that many/all of these new global options work with vo_opengl only. --vo=opengl-hq is also deprecated. There is extensive compatibility with the old behavior. One exception is that --vo-defaults will not apply to opengl-hq (though with opengl it still works). vo-cmdline is also dysfunctional and will be removed in a following commit. These changes also affect opengl-cb. The update mechanism is still rather inefficient: it requires syncing with the VO after each option change, rather than batching updates. There's also no granularity (video.c just updates "everything", and if auto-ICC profiles are enabled, vo_opengl.c will fetch them on each update). Most of the manpage changes were done by Niklas Haas <git@haasn.xyz>.
* vo_opengl: rename 3dlut-size to icc-3dlut-sizeGravatar wm42016-09-02
| | | | | Not documenting this yet, because a later commit will change all the options anyway.
* vo_opengl: minor renderer option access refactorGravatar wm42016-09-02
| | | | | | | | | | | | | Reduce accesses to the renderer opts in vo_opengl.c, and instead add accessors for them to video.c. I suppose gamma and maybe icc-auto could be moved to vo_opengl.c options. Also, the output colorspace could probably be adjusted to what is really used, not just the options (although it's possible that this commit changes this, due to video.c mutating its own copy of the options according to actual renderer capapbilities). But don't deal with this now.
* vo_opengl: remove pre/post/scale-shadersGravatar Niklas Haas2016-09-02
| | | | | | | | | | Deprecated in favor of user-shaders, which are functionally equivalent but superior. (Except in the case of scaler-shader, which has no direct replacement, but it turned out to be a very unpopular feature either way - most custom scalers don't fit into the mpv kernel infrastructure and are therefore implemented as user shaders either way) Signed-off-by: wm4 <wm4@nowhere>
* vo_opengl: explicitly check for GL errors around framebuffer depth checkGravatar wm42016-08-29
| | | | | | | | It seems like many GL implementations (including Mesa) choke on this, while others are fine. We still think that this use of the GL API is allowed by the standard (at least in the Mesa case), so to reduce confusion, explicitly check the "controversial" calls, and use an appropriate error message.
* vo_opengl: angle: new opengl flag to control DirectCompositionGravatar Avi Halachmi (:avih)2016-08-25
| | | | | On some systems DirectComposition might behave poorly. Add an opengl suboption flag 'dcomposition' (default=yes) which can disable it.
* wayland_common: fix fullscreen image switching bugGravatar Rostislav Pehlivanov2016-07-30
| | | | | | | | | | | The problem was that when in fullscreen, switching between images did not issue a resize event, causing none of the images to be rendered correctly. This fixes the problem by issuing a resize event with the screen width and height. This commit also moves the zeroing of the events field to when it gets retrieved by mpv rather than randomly after a resize in the vo/backend code.
* vo_opengl: remove the 3dlut-size npot2 restrictionGravatar Niklas Haas2016-07-25
| | | | | | | | This requires changing the pixel upload alignment because the odd sizes might not be aligned to multiples of 4. Anyway, the restriction has no real benefit and the sizes in between 32 and 64 might be worth using, so just drop it.
* vo_opengl: reduce default 3dlut-size to 64x64x64Gravatar Niklas Haas2016-07-25
| | | | | | | | | | | | | | | | | | | | | | | | Following testing after ebe798a, this is a more than sufficient size to cover our use case. The old default was a drop of about 58 dB PSNR using the old code, and this new default is about 65 dB PSNR, so it's actually an improvement despite resulting in a smaller size. There was no outlier whatsoever when comparing sizes around the 64 neighbourhood (with every step corresponding to a PSNR drop of about 0.07 dB), so I picked this since it's a power of two and requires no change to the current 3dlut-size parsing logic. I also tested smaller sizes such as 32x32x32 which performed almost as well on colorful samples, but this results in noticeable black boost in the dark regions, which is pretty undesirable. Therefore, we should avoid going much further below 64x64x64. Either way, this new size is so fast to compute that the 3dlut cache is almost useless on my end. In fact, it might even be slower to load the profile from the cache than to recompute it from scratch. (For caches on a disk. For cache on a tmpfs, it makes no difference)
* vo_opengl: increase 3DLUT accuracy at lower LUT sizesGravatar Niklas Haas2016-07-25
| | | | | | | | | | | | | | | | | | | | | | | | This code had the exact same texture indexing bug that the original scaler code had before the introduction of the LUT_POS macro to fix it. We can re-use this same macro here, and the performance drop is virtually entirely negligible. The benefit is greatly improved LUT accuracy as the 3DLUT size decreases - in particular, the old LUT started introducing more and more black crush the lower your LUT size is (because the error was essentially an over-contrast bias, with a magnitude linearly related to the lut size). The new code improves black stability as the LUT size decreases, and only at very low values (16 and below) do black levels start noticeably getting affected (due to crude linearization of the nonlinear response curve). The default value of 3dlut-size is definitely generous enough for this to make no difference out of the box, but it also causes no performance drop at all on my machine so I see no harm in improving the logic. Furthermore, this means we could easily decrease the default 3dlut size in a future commit, perhaps even down to 64x64x64 as a default. (But more testing is warranted here)
* wayland: port to the new wakeup/wait_events frameworkGravatar Rostislav Pehlivanov2016-07-21
| | | | | | | | | | | | This fits natively into the vo/backend and allows to simplify the polling code. One new change is the fact that surface_handle_enter flags VO_EVENT_WIN_STATE and VO_EVENT_RESIZE instead of only VO_EVENT_WIN_STATE. Before this, the code hackily relied on the timeout and the loop in the wait_frame function to track and set the scaling factor. Instead, this triggers mpv to run a schedule_resize and adjust the new VO output dimensions immediately. This is also more accurate since surface_handle_enter() gets called when a surface is created, moved and resized, which is exactly what the rest of the player might be interested in.
* vo_opengl: add a tscale=linear direct implementationGravatar Niklas Haas2016-07-21
| | | | | | | | | | This uses GLSL mix() instead of going through an indirect texture access. Easy to implement and might require less resources on some devices, since the oversample code was already essentially just a special case of this. Could be made the new default (as per issue #2685), but that should be done in a separate commit.
* x11: stop using vo.event_fdGravatar wm42016-07-20
| | | | Instead let it do its own event loop wakeup handling.
* vo_opengl: allow backends to provide callbacks for custom event loopsGravatar wm42016-07-20
| | | | | | | Until now, this has been either handled over vo.event_fd (which should go away), or by putting event handling on a separate thread. The backends which do the latter do it for a reason and won't need this, but X11 and Wayland will, in order to get rid of event_fd.
* videotoolbox: add yuv420p to --videotoolbox-formatGravatar wm42016-07-15
|
* vo_opengl: hwdec: reset hw_subfmt fieldGravatar wm42016-07-15
| | | | | | In theory, mp_image_params with hw_subfmt set to non-0 if imgfmt is not a hwaccel format is invalid. (It worked fine because nothing checks this yet.)
* video: change hw_subfmt meaningGravatar wm42016-07-15
| | | | | | | | | | | | | | | | | | The hw_subfmt field roughly corresponds to the field AVHWFramesContext.sw_format in ffmpeg. The ffmpeg one is of the type AVPixelFormat (instead of the underlying hardware format), so it's a good idea to switch to this too for preparation. Now the hw_subfmt field is an mp_imgfmt instead of an opaque/API- specific number. VDPAU and Direct3D11 already used mp_imgfmt, but Videotoolbox and VAAPI had to be switched. One somewhat user-visible change is that the verbose log will now always show the hw_subfmt as image format, instead of as nonsensical number. (In the end it would be good if we could switch to AVHWFramesContext completely, but the upstream API is incomplete and doesn't cover Direct3D11 and Videotoolbox.)
* vo_opengl: angle: use WARP if there are no hw adaptersGravatar James Ross-Gowan2016-07-12
| | | | | | | | | | | | | | | This should get mpv working on Windows 7 machines without hardware accelerated graphics adapters. It already worked on Windows 8 and up because those systems would silently fall back to WARP if there was no graphics hardware installed. The normal MPGL_CAP_SW flag is not set, so unlike other opengl backends, this will choose a software adapter even if opengl:sw is not specified. The reason for this is, unlike on Linux, where vo_xv and vo_x11 can be used, mpv on Windows does not have any VO to fall back on when hardware acceleration isn't available, so if software adapters are rejected, the user won't see any video output when using the default settings. WARP seems to perform quite well, so it should be used in this case.
* vo_opengl: angle: try D3D9 when D3D11 fails eglInitializeGravatar James Ross-Gowan2016-07-11
| | | | | This will happen when D3D11 is present on the machine but the supported feature level is too low.